CHASM (tm)
Cheap Assembler
for the IBM Personal Computer
User's Manual
(c) 1985 by David Whitman
Version 4.07
David Whitman
P.O. Box 1157
North Wales, PA 19454
(215) 641-7522 (days)
(215) 234-4084 (evenings)
Table of Contents
Why CHASM?..................................................1
What can CHASM do?..........................................2
What WON'T it do?...........................................3
System Requirements.........................................4
Advanced and Subset CHASM...................................5
Modifying CHASM's I/O Defaults..............................6
Syntax.....................................................11
Labels.....................................................13
Operands...................................................16
Operand Expressions........................................22
Resolution of Ambiguities..................................25
Pseudo-Operations..........................................29
Macros.....................................................38
Structures.................................................48
8087 Support...............................................52
Outside the Program Segment................................54
Running CHASM..............................................55
Error and Diagnostic Messages..............................59
Execution of Assembled Programs............................64
Notes for Those Upgrading to This Version of CHASM.........75
Miscellaneous and A Word From Our Sponsor..................78
=============================================================
Appendix A: 8088 Mnemonic List.............................83
Appendix B: 8087 Mnemonic List.............................85
Appendix C: Differences Between CHASM and TOA..............86
Appendix D: Description of Files...........................90
Appendix E: Bug Reporting Procedure........................91
Appendix F: Using CHASM on "Compatible" Systems............92
Advanced Version Order Form................................94
1
>>Why CHASM?<<
Why go to the trouble to write an assembler, when one already
exists? The IBM Macro Assembler is a very powerful software tool
available off the shelf. It supports features such as macros,
multiple segments, and linking to external procedures.
Unfortunately, the cost of all this power is complexity. The
Macro Assembler is so complicated that IBM warns beginners it is
only suitable for "experienced assembly language programmers".
For most users, this sophistication is more of a hindrance than
an aid. Even when writing short, simple programs, the user is
saddled with a set of confusing pseudo-ops only appropriate for
large, multi-segment programs. Producing a fast loading
executable file requires running three separate programs (MASM,
LINK and EXE2BIN) before you can get down to testing.
The Macro Assembler is totally unsuitable for use with BASIC.
Although it is *possible* to produce machine language BASIC
subroutines with the Macro Assembler, the process is incredibly
convoluted and confusing.
To top it all off, the Macro Assembler costs an overpriced $125.
CHASM is, I hope, a more reasonable compromise between power and
accessibility. CHASM is simple to use and understand. Unlike
the Macro Assembler, CHASM doesn't require a second LINK step to
produce a working program. CHASM also produces fast loading
programs without the use of the utility EXE2BIN.
CHASM supports several different simple mechanisms for getting
machine language routines into BASIC and Turbo Pascal, the two
most popular languages for the IBM Personal Computer.
Finally, the suggested payment for CHASM is a modest $40.
A Note for Beginners:
Before going on, you might find it useful to print and read the
file PRIMER.DOC, included on your CHASM disk. PRIMER is a gentle
introduction to assembly language, which will teach you some of
the vocabulary and key concepts you will need to start out with.
2
>>What can CHASM do?<<
CHASM is a tool for translating assembly language into the native
machine language of the 8088 microprocessor. Using CHASM, you
can write down easy to remember "mnemonics" which are then
converted into the relatively incomprehensible series of ones and
zeros that your PC prefers to work with.
In addition to simple mnemonic translation (such as provided by
the mini-assembler in DEBUG), CHASM provides a great many
"convenience" features which make writing machine language much
easier.
CHASM allows you to define labels for branching, rather than
requiring you to figure out offsets or addresses to jump to.
CHASM also lets you give symbolic names to any constants or
memory locations you use, to make your program easier to
understand.
You can instruct CHASM to make your file "BLOADable" so that
BASIC can load it as a subroutine. A utility is also provided
to convert machine language into BASIC "DATA" statements, so that
BASIC can "POKE" routines into memory. Similarly, for Turbo
Pascal, CHASM can produce external procedures and functions, or a
file of Turbo "INLINE" statements.
CHASM has intelligent error and diagnostic messages which guide
you in correcting mistakes and ambiguities in your program. A
nicely formatted listing is produced during assembly, to help
during debugging.
In general, CHASM is designed to eliminate much of the confusion
and dirty work involved in writing machine language for the IBM
PC.
Using CHASM, you can produce:
1. Lightning fast "stand-alone" programs.
2. Machine language subroutines for BASIC programs, both
interpreted and compiled.
3. Machine language procedures and functions for Turbo Pascal.
3
>>What WON'T it do?<<
In the interest of simplicity, CHASM has the following
restrictions:
1. Multiple segment definitions are not allowed. CHASM assumes
that your entire program fits in one segment, that the cs, ds,
and es registers all point to this same segment, and that the
ss register points to a valid stack area. An equivalent
statement is that CHASM produces COM files, not EXE files.
2. Linking to Microsoft languages is not supported. You can't
use CHASM to produce object modules for use with IBM/Microsoft
Pascal or FORTRAN.
4
>>System Requirements<<
Minimum system requirements to use CHASM are:
IBM PC or true compatible (must emulate IBM BIOS)
128K of memory (192K for IBM PCjr)
1 disk drive
80 column display
DOS 2.0 or later.
Note the DOS 2 requirement. To provide *true* DOS 2 support, it
was necessary to give up DOS 1 compatibility. If you're still
using DOS 1, you'll need to upgrade to DOS 2.0 or later to use
CHASM.
Adding more memory will allow you to assemble larger programs.
CHASM can take advantage of all available memory, up to a
megabyte.
CHASM will run faster if your source files and object files are
on a hard disk or RAM disk.
If you have a non-IBM computer, please read about the /VIDEO
switch in "Modifying CHASM's I/O Defaults" and also the appendix
titled "Using CHASM on Compatible Systems".
5
>>Advanced and Subset CHASM<<
CHASM is available in two flavors, Advanced and Subset.
The two versions vary in their capabilities and method of
distribution.
The subset version is the budget release. It may be freely
copied by individuals as "user-supported" software, and is
available from user groups and bulletin boards across the
country. Every time the subset runs, it prints a banner page
suggesting a payment of $40 to the author. As its name
suggests, Subset CHASM does not support all the features of
Advanced CHASM.
Advanced CHASM is the deluxe release. It runs twice as fast, and
has a number of features not supported by the subset:
macros
conditional assembly
8087 support
include files
operand expressions
structures
Advanced CHASM is only available directly from Whitman Software.
Users who make the suggested $40 payment receive an upgrade to
Advanced CHASM. Details and an order blank are given at the
end of this document.
Throughout this document, features supported only by the
advanced version will be marked as follows:
==>Advanced version only.
Attempts to use these features in the subset version will
generally result in error messages, with the advanced feature
otherwise being ignored; however, unpredictable behavior could
result in some instances.
6
>>Modifying CHASM's I/O Defaults<<
CHASM supports the use of a configuration file, which can be used
to override some of CHASM's I/O defaults. If you are willing to
accept CHASM's defaults, you may skip this section - on most
systems, CHASM will work perfectly well without the file
described below.
The configuration process involves supplying a text file named
CHASM.CFG on your default drive. You can create this file with
any text editor or word processor.
You have some freedom in where you put the configuration file.
If CHASM.CFG is not found in the current directory, CHASM will
also try the following paths:
\CHASM\CHASM.CFG
\CHASM.CFG
The file should contain a series of text "switches" of the
following form:
/SWITCH, xx [, yy, zz,...]
Where "/SWITCH" is a reserved word, listed below, and xx, yy, zz
are numbers. The brackets around yy and zz indicate that these
numbers are optional - don't put brackets in CHASM.CFG.
Each switch should start a new line, and the switch and each of
the numbers should be separated by one or more blanks or commas.
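As a rough illustration of the line format just described, here is a
small Python model that splits one configuration line into its switch
name and its numbers. It is not part of CHASM: the function name
parse_switch is made up for this sketch, and it assumes the numbers are
written in decimal, as in the examples in this section.

```python
import re

def parse_switch(line):
    """Split one CHASM.CFG line into (switch, [numbers]).

    Hypothetical helper for illustration only: fields are separated by
    one or more blanks or commas, and the numbers are assumed decimal.
    """
    fields = [f for f in re.split(r"[ ,]+", line.strip()) if f]
    if not fields or not fields[0].startswith("/"):
        raise ValueError("line must begin with a /SWITCH")
    return fields[0].upper(), [int(f) for f in fields[1:]]

print(parse_switch("/80, 18, 12"))
print(parse_switch("/FG 5"))
```

Blanks and commas are treated alike, so "/BG, 3" and "/BG 3" come out
the same in this model.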
The following switches are implemented:
/VIDEO Video access method. With this switch, you can
control the method CHASM uses to write data on your
video screen. Three settings are currently
supported:
/VIDEO 0 Direct write to screen memory.
/VIDEO 1 Direct write to screen memory, but only
during the horizontal retrace interval.
/VIDEO 2 Write using BIOS calls.
7
/VIDEO (continued) In general, the lower a number you
specify for /VIDEO, the faster CHASM will run.
/VIDEO 0 is the fastest mode, and is intended for use
with either the IBM monochrome display adapter or the
EGA adapter. You can use /VIDEO 0 with the CGA
adapter, but some annoying "snow" and flicker may
result.
/VIDEO 1 is intended for use with the CGA adapter.
It is almost as fast as /VIDEO 0, and the snow is
eliminated. /VIDEO 1 is CHASM's default mode.
/VIDEO 2 is intended for non-IBM systems that emulate
the IBM BIOS, but have significantly different screen
hardware. If you have trouble running CHASM on a
"compatible" system, try setting /VIDEO 2. Use of
/VIDEO 2 will slow CHASM down significantly.
/FG Foreground color. Users with color monitors may
select a foreground color from the following list:
0  Black         8  Gray
1  Blue          9  Light Blue
2  Green        10  Light Green
3  Cyan         11  Light Cyan
4  Red          12  Light Red
5  Magenta      13  Light Magenta
6  Brown        14  Yellow
7  White        15  High Intensity White
Example: (Magenta)
/FG 5
/BG Background color. Selections 0-7 above are
available. Example: (Cyan)
/BG, 3
8
/132 Printer 132 column mode. The numbers following this
switch are the ASCII codes for the characters which
cause your printer to go into condensed mode. You
may specify as many characters as you like. If you
don't provide this switch, CHASM will truncate source
lines in listings to avoid going over 80 columns. You
can also include characters to activate any special
features of your printer you want CHASM to use during
printing. Example: (for IBM printer)
/132, 15
/80 Printer 80 column mode. Similar to /132, but the
numbers represent the characters to return your
printer to normal. Include the codes for any
characters you want CHASM to send to your printer
before returning you to DOS. The following example
turns off condensed mode and causes a Form Feed on
the IBM printer:
/80, 18, 12
/FF Formfeed. A value of zero tells CHASM that your
printer doesn't recognize ASCII 12 as a formfeed
command. Any other value, and CHASM assumes that
formfeeds will be recognized. The default is no
formfeed character.
If /FF is off, CHASM will simulate formfeeds with a
series of linefeeds. However, most printers respond
more quickly to a formfeed than to multiple linefeeds, so
set /FF on if possible. Example: (on)
/FF 1
/PAGELEN CHASM assumes that there are 66 lines to each printed
page. If you use different sized paper, enter the
number of lines per page after this switch.
Example: (European standard)
/PAGELEN 72
9
/LINES By default, CHASM will print 58 lines on each printed
page, then skip over the perforation to the next
page. You can change this number to suit your paper
and personal taste. Example:
/LINES 50
/BEEP Enables/Disables audible warning when errors are
discovered in your source program. A value of zero
turns off beeping, anything else turns it on. The
default is on. Example: (off)
/BEEP 0
/DWIDTH When listing to a disk file, CHASM normally truncates
listing lines to 80 columns to prevent wrap-around
when viewing the file. In some instances, such as
disk-based print spooling with DOS's PRINT utility,
you may wish to override this truncation. You can
enter a new truncation limit after this switch.
Example: (128 column lines)
/DWIDTH 128
/DPAGE For easier scanning, listings sent to disk files do
not normally contain page breaks. If you want to
produce printable listing files, you can turn on page
breaks by setting /DPAGE to any value other than
zero. Example: (page breaks on)
/DPAGE 1
10
/PATH Path Strategy. Affects the way CHASM constructs the
default file names for list and object files. If
/PATH is set to 0 (the default), any drive or path
specified on the source file name will be included in
the default file names. If /PATH is set to anything
else, path and drive info are removed, thus putting
the default list and object files in the current
directory of the logged drive. The following table
shows how this works:
/PATH | Source filename   | Default object file
------|-------------------|--------------------
  0   | a:\chasm\test.asm | a:\chasm\test.com
  1   | a:\chasm\test.asm | test.com
/PATH only affects how the default file names are
constructed - CHASM won't edit filenames that you
explicitly type in.
/TIMER Enables/Disables reporting of total assembly time. A
value of 0 turns off timing, anything else turns it
on. The default is off. This switch is used
internally at Whitman Software to benchmark new
versions of CHASM.
A sample CHASM.CFG file, suitable for use with the IBM dot matrix
printer, is included on your CHASM distribution disk.
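Going back to the /PATH switch, the rule for building the default
object file name can be modeled in a few lines of Python. This is only
a sketch of the behavior shown in the /PATH table, not CHASM's own
code. The function name default_object_name is hypothetical, and the
.com extension is taken from that table.

```python
import ntpath

def default_object_name(source, path_setting=0):
    """Model of the /PATH rule for the default object file name.

    Illustrative only: ntpath is used so that DOS-style names are
    parsed the same way on any host system.
    """
    root, _ext = ntpath.splitext(source)
    name = root + ".com"
    if path_setting != 0:
        name = ntpath.basename(name)   # drop drive and directory
    return name

print(default_object_name(r"a:\chasm\test.asm", 0))
print(default_object_name(r"a:\chasm\test.asm", 1))
```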
11
>>Syntax<<
CHASM accepts a standard DOS text file for input. In this
context, "standard DOS text file" means:
1. Lines are terminated with a CR/LF pair.
2. The end of the file is marked with an EOF marker (cntl-Z).
Virtually every word processor or editor produces files which
meet these criteria. Some word processors have to be in a
special mode to produce standard DOS files. (For example, in
WordStar, you have to be in "non-document" mode.) The user's
manual for your editor should inform you if any special mode
needs to be activated.
Some users have reported problems with the user-supported word
processor PC-Write. Under certain circumstances, older
versions of PC-Write can produce lines ending with just a LF
(violation of rule 1) or files which aren't terminated with an
EOF marker (violation of rule 2). If you use PC-Write, make sure
you have the latest version.
Lines may be any combination of upper and lower case characters.
CHASM does not distinguish between the two cases: everything
except single quoted strings is automatically converted to upper
case during the parsing process. Thus, BUFFER, Buffer, buffer,
and bUFFer all refer to the same symbol.
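The case rule can be pictured with a short Python model. It is a
sketch of the behavior described above and not CHASM's actual parser.
The name fold_case is made up, and the model does not treat comments
specially.

```python
def fold_case(line):
    """Upper-case everything outside single-quoted strings.

    Illustrative model only. A doubled quote inside a string closes
    and reopens the string, so the text around it is left alone.
    """
    out = []
    in_string = False
    for ch in line:
        if ch == "'":
            in_string = not in_string
            out.append(ch)
        elif in_string:
            out.append(ch)           # inside a string: keep as typed
        else:
            out.append(ch.upper())   # outside a string: fold to upper
    return "".join(out)

print(fold_case("mov al,'a'"))
```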
The characters blank ( ), comma (,), single quote ('),
semi-colon (;) and TAB are reserved, and have special meaning to
CHASM (see below).
Each line must be no more than 80 characters long and have the
following format:
Label Operation Operand(s) ;comment
The different fields of an input line are separated by the
delimiters blank ( ), comma (,) or TAB. Any number of any
delimiter may be used to separate fields.
12
Explanation of Fields:
Label: A label is a string of characters, beginning in column 1.
Depending on the operation field, the label might represent a
program location for branching, a memory location, or a
numeric constant (see the section titled "Labels" for more on
this topic). To ensure that CHASM can distinguish between
labels and numeric constants, the first character of a label
must *not* be a digit (0-9), plus sign (+) or minus sign (-).
Anything beginning in column 1, except a comment, is
considered a label.
Operation: Either a pseudo-op (see the section with the same
name) or an instruction mnemonic as defined in "The 8086
Book" by Rector and Alexy. A list of acceptable mnemonics is
given in Appendix A.
Note 1: Except as modified below, "The 8086 Book" is the
definitive reference for use with CHASM.
Note 2: There are several ways to resolve some ambiguities in
8086 assembly language. Please read page 3-285 of The 8086
Book, and "Resolution of Ambiguities" in this document.
Operand(s): A list of one or more operands, as defined in
the section titled "Operands", separated by delimiters.
Comment: Any string of characters, beginning with a semicolon
(;). Anything to the right of a semicolon will be ignored by
CHASM.
Note that except for the case of an operation which requires
operands, or the EQU pseudo-op which requires a label, all of the
fields are optional. The fields MUST appear in the order shown.
13
>>Labels<<
The symbol in the label field of an input line can be interpreted
in four different ways by CHASM:
1. A program location which may be branched to.
2. A memory location for data storage.
3. An equated symbol, which takes the place of a numeric
constant.
4. A macro name, which takes the place of a series of
frequently used source code lines.
The default interpretation of a symbol is a program location for
branching. This default is modified by the presence of one of
the following pseudo-ops in the instruction field:
DB, DM, DS, or DW:
    Normal:          The symbol is a memory location.
    In a Structure:  The symbol is a numeric constant.
EQU:
    Normal:          The symbol is a numeric constant.
    Memory Option:   The symbol is a memory location.
MACRO:
    The symbol is a macro name.
A given symbol may have only ONE of the above interpretations!
Attempts to branch into a memory location or an equated symbol
will result in error messages. Similarly, CHASM will not allow
you to treat program code as a data area. Examples:
TEXT DB 'Hit any key when ready' ;memory location
MOV AL,TEXT ;ok
JMP TEXT ;wrong!
LOOP MOV AX,CX ;program location
JMP LOOP ;ok
MOV AX, LOOP ;wrong!
14
If for some arcane reason you *need* to branch into a data area,
you can fool CHASM by placing a label on an otherwise blank line,
immediately before the data area. Example:
JNZ NPU
RET
NPU ;dummy label for jump
DB 0D8H, 00H ;an 8087 instruction
If you have a masochistic urge to crash your system by writing
self-modifying code, there are at least two ways you can defeat
CHASM's injunction against using program code as a data area.
The first way is to use DS to declare zero bytes of storage
immediately before the code you want to access. A label on the
null DS will have the same offset as the immediately following
code. Example:
MOVW JUNK1, 9090H ;change endless loop to NOP
JUNK1 DS 0
JUNK JMP JUNK
A sneakier approach is to load the OFFSET of a program
location into a register, then use the register for indirect
addressing. Using the optional displacement field, you can even
address the middle of an instruction. Examples:
MOV BX, OFFSET(CALL)
MOVB 1[BX], 00H ;change interrupt number in code
CALL INT 0
In general, I cannot recommend trying to get around CHASM's type
restrictions. If you find yourself in a situation where it
seems necessary to fool CHASM, there's probably a safer, more
direct way to legally program what you're trying to accomplish.
Labels can be up to 80 characters long, but only the first 15
characters are significant. For example, CHASM considers the
following labels identical:
VERYLONGLABELOVER15CHARACTERS
VERYLONGLABELOVER
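A tiny Python model makes the 15 character rule concrete. The names
label_key and same_label are invented for this sketch. It only
reflects what is stated above (case is ignored, and characters past
the fifteenth are ignored) and says nothing about how CHASM stores its
symbol table.

```python
SIGNIFICANT = 15  # per the text above: only the first 15 characters count

def label_key(label):
    """Return the form of a label that this model compares."""
    return label.upper()[:SIGNIFICANT]

def same_label(a, b):
    """True if the two labels would be treated as the same symbol."""
    return label_key(a) == label_key(b)

print(same_label("VERYLONGLABELOVER15CHARACTERS", "VERYLONGLABELOVER"))
```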
15
To avoid ambiguity, a given string can legally appear in the
label field of only ONE statement. If the same string appears on
more than one instruction, all instances after the first will
receive an error message. Remember that labels are only
significant to 15 characters, and if the first 15 characters are
identical, an error will occur.
TWO EQU '2' ;first use is ok
TWO PROC FAR ;wrong! symbol already defined
CHASM has a number of reserved strings which are "predefined" in
the symbol table, and will generate an error message if used as
labels. All the register names are reserved, as are the indirect
address mode combinations. The only other reserved strings are
the words "NEAR" and "FAR", and the symbol "$". Examples:
AX MOV AX, DX ;wrong! (register name)
[DI] ADD AX, BX ;wrong! (indirect address)
FAR CALL GETINPUT ;wrong! (reserved word)
$ SUB AX, DX ;wrong! (reserved word)
16
>>Operands<<
The following operand types are allowed.
1. Immediate data: A number, stored as part of the program's
object code. Immediate data are classified as either byte,
expressible as an 8 bit binary integer; or word, expressible
as a 16 bit binary integer. If context requires it, CHASM
will left-pad byte values with zeroes to convert them to word
values. Attempts to use a word value where only a byte will
fit will cause an error message to be printed.
Immediate data may be represented in 10 ways:
A. An optionally signed decimal number in the range -32768
to 32767. Examples:
MOV AL, 21
MOV BX, -6300
B. A series of up to 4 hex digits, followed by the letter
H. The first digit must be non-alphabetic, i.e. in the
range 0-9, to allow CHASM to distinguish between numbers
and symbols. If necessary, a leading zero, which does
not count in the four allowed digits, may be added to
fulfill the non-alphabetic condition.
If the RADIX pseudo-op has been used to change the
default number base to 16, the final H can optionally be
omitted. Examples:
ADD CX, 0B123H ;leading zero required
RADIX 16 ;change default number base
ADD DL, 12 ;same as 12H
C. A series of up to 16 binary digits, followed by the
letter B. Examples:
MASK EQU 00000111B
MOV AL, 10000000B
17
D. A symbol representing types A, B or C above, defined
using the EQU pseudo-op. Examples:
MASK EQU 10H
MAX EQU 1000
AND CL,MASK
SUB AX,MAX
E. The offset of a label or storage location returned by
the OFFSET operator. OFFSET always returns a word
value. OFFSET is used to get the address of a named
memory location, rather than its contents. Example:
MOV DI,OFFSET(BUFFER)
BUFFER DS 0FFH
F. The ASCII value of a printable character, represented by
the character enclosed in single quotes ('). Thus, the
following lines will generate the same object code:
MOV AL,41H ;ascii code for 'A'
MOV AL,'A'
G. When producing Turbo Pascal INLINE code, the function
TURBO(x) assembles as 16 bit immediate data.
H. ==>Advanced version only:
Labels within structures become immediate operands whose
values equal their offset within the structure. See the
section titled "Structures" for more detail and
examples.
I. ==>Advanced version only:
The length of a structure, returned either by the LENGTH
operator, or simply the structure's name. See the
section titled "Structures" for more detail and
examples.
J. An expression that evaluates out to type Immediate.
Examples:
MOV AX, (3+2)*5
MOV DX, MEMLOC1-MEMLOC2
See the "Operand Expressions" section for more details.
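To make the literal forms concrete, here is a Python sketch that
evaluates forms A, B, C and F from the list above. It is an
illustrative model and not CHASM's own logic. The name parse_immediate
is hypothetical, the default number base of 10 is assumed (no RADIX in
effect), and no range checking is done.

```python
def parse_immediate(text):
    """Evaluate a decimal, hex (H), binary (B) or quoted-character literal.

    Illustrative model only: assumes the default number base of 10.
    """
    t = text.strip()
    if not t:
        raise ValueError("empty operand")
    if len(t) == 3 and t[0] == "'" and t[2] == "'":
        return ord(t[1])                    # F: quoted character
    u = t.upper()
    if u.endswith("H") and u[0].isdigit():
        return int(u[:-1], 16)              # B: hex, must start with 0-9
    if u.endswith("B") and len(u) > 1 and set(u[:-1]) <= {"0", "1"}:
        return int(u[:-1], 2)               # C: binary
    return int(u, 10)                       # A: optionally signed decimal

print(parse_immediate("0B123H"), parse_immediate("10000000B"))
```

Note how the leading zero matters in this model: without it, B123H
starts with a letter and is not accepted as a number.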
18
2. Register Operands
A. An 8 bit 8088 register from the following list:
AH AL
BH BL
CH CL
DH DL
B. A 16 bit 8088 register from the following list:
AX BX CX DX SP BP SI DI
C. An 8088 segment register from the following list:
CS SS DS ES
D. An 8087 stack register from the following list:
ST ST(1) ST(2) ST(3) ST(4) ST(5) ST(6) ST(7)
Note: ST can also be referenced as ST(0).
3. Memory Operands: The contents of a memory location addressed
by one of the following methods. Note that none of the memory
addressing options specifies the *size* of the operand. 8088
instructions have word or byte sized operands, and 8087
instructions can have word, short, long or temporary real
operands. See the section titled "Resolution of Ambiguities"
for more on this topic.
All memory operands can optionally be preceded with a segment
override to specify the segment register to be used in
accessing memory. The override takes the form of a segment
register, followed by a colon, followed by the memory operand.
No intervening delimiters are allowed. Examples:
MOV AX, ES:[80H] ;80H into the extra segment
MOV CS:[DI], SI ;indirect with DI in the code segment
Segment overrides are also discussed in the section titled
"Resolution of Ambiguities".
19
A. Direct Memory Address.
1. A number or other immediate operand, enclosed in
brackets, indicating an offset into the data segment.
Example:
BUFFER EQU 5A5AH
MOV BH, [BUFFER]
MOV [80H], DI
MOV AX, CS:[TURBO(I)]
2. A symbol, defined to be a variable (i.e. a named memory
location) using the EQU pseudo-op. Example:
FCB EQU [80H]
MOV DI,FCB
3. A symbol, defined to be a variable by its use on a
storage defining pseudo-op. Examples:
MOV AX,FLAG
MOV DATE,BX
FLAG DS 1
DATE DB 31
4. The special symbol '$'. $ returns an address of value
equal to the current setting of CHASM's location
counter. This address can be used as either a memory
location or a program location for branching. Used by
itself, $ has little utility, but it is a very powerful
tool when used in expressions. Example:
MOV AX, $+4 ;get the byte after this instruction
5. ==>Advanced version only:
An expression that evaluates out to type address.
Example:
MOV AX, buffer+3
BUFFER DS 20
See the "Operand Expressions" section for more details.
20
B. Indirect Memory Address: The address of the operand is the
sum of the contents of the indicated register(s) and a
displacement. The register, or sum of registers, is
enclosed in square brackets: []
The displacement is optional, and takes the form of an
immediate operand, placed without intervening delimiters to
the left of the first bracket.
Immediate data written as decimal numbers, with values
between 127 and -128 generate a signed 8 bit offset.
Values outside this range, or those expressed in some
other manner generate 16 bit offsets.
The following indirect modes are provided:
1. Indirect through a base register (BX or BP). Examples:
ENTRYLENGTH EQU 6
MOV AX, ENTRYLENGTH[BP]
MOV DL, -2[BX]
MOV CX, TURBO(COUNT)[BP]
MOV 9A9AH[BX], AX
2. Indirect through an index register (DI or SI).
Examples:
MOV [DI], CX
MOV CX, -5[SI]
3. Indirect through the sum of one base register and one
index register. Examples:
MOV [BP+DI], SP ;note that no spaces are
MOV BX, 10H[BX+SI] ;allowed within the
MOV CL, [BP+SI] ;brackets.
MOV DH, -2[BX+DI]
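The rule for displacement size given earlier in this item (decimal
values from -128 to 127 give 8 bits, anything else gives 16) can be
written out as a short Python check. This is a sketch of the stated
rule only. The name displacement_bits is made up, and the case of an
omitted displacement is not modeled.

```python
import re

def displacement_bits(disp):
    """Size in bits of an indirect-address displacement.

    Illustrative model: a plain decimal number from -128 to 127 gives
    8 bits; anything else (hex, symbols, larger values) gives 16 bits.
    """
    if re.fullmatch(r"[+-]?\d+", disp) and -128 <= int(disp) <= 127:
        return 8
    return 16

print(displacement_bits("-2"), displacement_bits("10H"))
```

Read this way, a small value written in hex, or through an equated
symbol, still gets a 16 bit displacement, because it is "expressed in
some other manner".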
21
4. Labels
A label on most instructions may be used as an operand for
call and jump instructions. See the section titled "Labels"
for more information. Examples:
START PROC NEAR
CALL GETINPUT
JMPS START
ENDP
GETINPUT PROC NEAR
5. Strings
A string is any sequence of characters (including delimiters)
surrounded by single quotes ('). If you want to use the
single quote character inside a string, put two of them
in a row for each one you want in the string. Example:
DB 'This is a ''string'' with embedded quotes'
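The doubled quote rule is easy to check with a few lines of Python.
This is only an illustration of the convention described above, and
the name string_value is hypothetical.

```python
def string_value(literal):
    """Return the characters a single-quoted string literal stands for.

    Illustrative helper: strips the outer quotes and turns each
    doubled quote into one quote character.
    """
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a single-quoted string")
    return literal[1:-1].replace("''", "'")

print(string_value("'This is a ''string'' with embedded quotes'"))
```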
22
>>Operand Expressions<<
CHASM can perform arithmetic calculations at assemble time, to
help you generate memory addresses and immediate data constants.
The following operand types can participate in operand
expressions:
immediate data
program locations
data storage locations
Actually, for the purpose of expression evaluation, CHASM treats
both program locations and data storage locations identically.
Thus, really only two types of operands are allowed in
expressions: immediate and address.
Symbols standing for immediate operands must have been defined
prior to use in an expression, or you may get Phase Errors (see
the error message section for a discussion on this topic). You
should put all your EQU and STRUC pseudo-ops at the beginning of
your program, before any machine instructions.
The following arithmetic operators are available:
+ addition
- subtraction
* multiplication
/ integer division
Note that / performs *integer* division. 5/2 evaluates to 2.
The normal rules of precedence apply. To force evaluation
contrary to normal precedence, you can use parentheses: ().
MOV AX, 2+5*3 ;means mov ax, 17 because * outranks +
MOV AX, (2+5)*3 ;means mov ax, 21
No delimiters can appear within an expression. If you separate
the parts of an expression with delimiters, CHASM will try to
interpret each part as a separate operand.
MOV [BX], BUFFER + 3 ;WRONG!!
MOV [BX], BUFFER+3 ;ok
23
The type of an operand expression is determined by the individual
operand types that are combined within it. In evaluating each
operation in an expression, the result type follows this rule:
If the two operand types are the same, the result type is
IMMEDIATE.
If the two operand types are different, the result type is
an ADDRESS.
In expressions, both program locations and data storage locations
are considered identical. An expression which evaluates to type
address can be used in either a MOV or a JMP.
The result type of a complicated expression is determined by
replacing individual binary operations with their result, and
evaluating the (simpler) expression thus formed, following the
above rules.
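The typing rule can be modeled in Python as follows. This is a sketch
of the rule as stated above, not CHASM's implementation. The names
result_type and expression_type are invented, and an expression is
written as nested (left, right) pairs because the operators themselves
play no part in the type.

```python
IMMEDIATE, ADDRESS = "immediate", "address"

def result_type(left, right):
    """Type of one binary operation: same types give IMMEDIATE."""
    return IMMEDIATE if left == right else ADDRESS

def expression_type(expr):
    """Type of a nested expression given as (left, right) tuples.

    Leaves are the strings IMMEDIATE or ADDRESS. Each pair is replaced
    by its result type, working from the innermost pair outward.
    """
    if isinstance(expr, tuple):
        left, right = expr
        return result_type(expression_type(left), expression_type(right))
    return expr

# address - address, e.g. the distance between two labels
print(expression_type((ADDRESS, ADDRESS)))
# address + immediate, e.g. a label plus a constant
print(expression_type((ADDRESS, IMMEDIATE)))
```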
Here's a program fragment with some expressions, and a discussion
of their types and significance:
EXPSAMPLE PROC NEAR
MOV AX, 3+5 ;immediate, calculated value
MOV DX, END-STRING ;immediate, length of string
MOVB STRING+4, 'A' ;address, fifth byte of string
JMP EXPSAMPLE+100H ;address, used as program loc.
JMP STRING+100H ;valid address syntax, but silly
STRING DB 'MESSaGE' ;data area
END ;marks position of end of string
The special symbol '$' returns the current value of CHASM's
location counter for use in expressions. $ is of type address.
Thus, $-1 is the address of the byte immediately prior to the
instruction currently being assembled.
24
The rules for typing expressions were set up to produce the
"most useful" result type, taking a guess as to why one would
want to do a given calculation. If the type isn't what you have
in mind, you can coerce CHASM using square brackets or the
OFFSET function:
MOV AX, 3+5 ;immediate
MOV AX, [3+5] ;coerced to address
MOV DX, $+4 ;address
MOV DX, OFFSET($+4) ;coerced to immediate
Under certain circumstances, CHASM can get confused when you use
OFFSET or LENGTH functions in expressions. The problem occurs if
the expression starts with a function, and ends with a close
parenthesis:
MOV AX, OFFSET(BUFFER)*(3+2) ;will be misinterpreted
CHASM will try to interpret this as the offset of "BUFFER)*(3+2",
and you'll get the error message "Illegal or undefined argument
for OFFSET". The solution is to enclose the whole works in
parentheses, to force CHASM to recognize it as an expression:
MOV AX, (OFFSET(BUFFER)*(3+2)) ;correct
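One way to picture the misreading is with a deliberately naive Python
model. This is a guess at the mechanism, based only on the symptom
described above, and is not taken from CHASM's source. The name
naive_offset_argument is hypothetical.

```python
def naive_offset_argument(operand):
    """Guess at how an operand could be mistaken for one OFFSET call.

    Hypothetical model: if the text starts with "OFFSET(" and ends
    with ")", everything in between is taken as the argument.
    """
    if operand.startswith("OFFSET(") and operand.endswith(")"):
        return operand[len("OFFSET("):-1]
    return None   # not treated as a bare OFFSET call

print(naive_offset_argument("OFFSET(BUFFER)*(3+2)"))
print(naive_offset_argument("(OFFSET(BUFFER)*(3+2))"))
```

In this model the outer parentheses help simply because the operand no
longer begins with the function name.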
25
>>Resolution of Ambiguities<<
The language defined in "The 8086 Book" contains a number of
ambiguities which must be resolved by an assembler. This is
discussed throughout the book, but pages 3-285 and 3-286
specifically cover this topic. CHASM's solutions of these
problems are discussed in this section.
A. 8088 Memory references:
When one specifies the address of a memory location, it is
unclear how large an operand is being referenced. An operand
might be a byte, or a word.
1. If a register is present as an operand, it is assumed that
the memory operand matches the register in size. The shift
and rotate instructions are an exception to this rule: there
the CL register is used as a counter, and has nothing to do
with the size of the other operand.
Examples:
MOV MASK, AX ;mask is a word
MOV DH, [BX] ;BX points to a byte
NEG [SI] ;error, operand of unknown size
SHR FLAG, CL ;error, flag is of unknown size
2. If no register is present, (or if the only register is CL
being used as a counter) the size of the memory operand is
specified by adding the suffix "B" or "W" to the
instruction mnemonic. Examples:
NEGB [SI] ;SI points to a byte
SHRW FLAG, CL ;flag is a word
MOVW MASK, 0AH ;mask is a word
MOVB MASK, 0AH ;mask is a byte
MOVW MASK, 9A9AH ;must specify size even though
;immediate operand implies word
B. 8087 Memory References:
All real and integer 8087 memory references are ambiguous as
to the operand size. Integer operands could be word, short,
or long. Reals can be short, long or temporary real. As with
8088 memory references, you specify the size using a suffix:
W: word
S: short
L: long
T: temporary real
For more details and examples, see the section on 8087
support.
C. Indirect Branching.
The 8088 supports two flavors of indirect branching: intrasegment
and intersegment. A register is set to point at a memory location
which contains a new value for the program counter, and in the
case of intersegment branching, a new value for the CS register
as well.
The syntax of "The 8086 Book" does not specify which flavor of
branch is being invoked. CHASM adds the suffixes "N" (for near,
or intrasegment) and "F" (for far, or intersegment) to the
indirect CALL and JMP mnemonics. Examples:
CALLN [BX] ;intrasegment call
JMPF [DI] ;intersegment jump
JMP [BP] ;error, unspecified flavor
D. Long and Short Jumps
Two types of relative jumps are supported by the 8088: short
(specified by a signed 8 bit displacement) and long (specified by
a 16 bit displacement). Both are implemented in CHASM as a jump
to a label.
The short jump is specified by mnemonic JMPS. Since one of the
displacement bits is used to indicate direction, only seven are
left to express the magnitude of jump. JMPS (and similarly, all
the jump on condition instructions) is thus limited to branching
to labels within a range of -128 to +127 bytes.
CHASM reserves mnemonic JMP for the long jump. JMP may be used
to jump anywhere within the program segment, but the object code
generated is less compact than that from JMPS.
Examples:
START PROC NEAR
JMPS END ;short jump
JMP START ;long jump
END ENDP
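The range limit can be checked by hand before choosing between
JMPS and JMP. The following is a minimal sketch in Python (used
only for illustration, it is not CHASM source). It assumes the
displacement is counted from the first byte after the two byte
short jump instruction, which is how the 8088 encodes short
jumps. The function name is made up for this example.

```python
def fits_short_jump(jump_offset, target_offset, instruction_length=2):
    """Return True if a short jump at jump_offset can reach target_offset.

    Assumption: the displacement is counted from the first byte after
    the jump instruction (2 bytes for a short jump on the 8088).
    """
    displacement = target_offset - (jump_offset + instruction_length)
    return -128 <= displacement <= 127


# A jump at offset 100H to a label at 150H is close enough for JMPS.
print(fits_short_jump(0x100, 0x150))   # True
# A jump at offset 100H to a label at 300H needs the long form, JMP.
print(fits_short_jump(0x100, 0x300))   # False
```

If the check fails, use JMP instead of JMPS.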
E. Instruction Prefixes.
The 8088 supports three instruction prefixes:
1. SEG: segment override. An alternate segment register is
specified for a reference to memory.
2. REP, REPE, REPNE, REPZ, REPNZ: repeat. A string primitive is
repeated until a condition is met.
3. LOCK: Turns on the LOCK signal. Only useful in
multi-processor situations.
SEG is implemented as a modifier attached to a memory operand.
If you want to override the default segment register for a memory
access, precede the memory operand with the segment register
followed by a colon. Examples:
MOV AX, ES:FLAG ;flag is in the extra segment
MOV CS:[100H], DX ;offset 100H in the code segment
The other prefixes are implemented as separate instructions. They
appear on a separate line, immediately before the instruction
which they modify. For compatibility with earlier versions of
CHASM, you can also specify segment overrides on a separate line,
using the mnemonic SEG. Examples:
REP
MOVSB ;move bytes until CX decremented to 0
SEG SS
MOV AX,BUFFER ;buffer is in the stack segment
LOCK
MOV [8FFH], AX ;lock bus to ensure data is transferred
;before other processors try to access it
Note for 8087 users: You may get unexpected results using the
separate instruction form of SEG on 8087 instructions. Use
the operand modifier form for segment overrides on 8087
instructions.
>>Pseudo-Operations<<
The following pseudo-ops are implemented:
A. BSAVE: Generate object code in BSAVE format.
Instructs CHASM to build a header in the format of BASIC's
BSAVE command. The resulting object code file may be BLOADed
by BASIC programs. No operands are required, and the
pseudo-op may appear anywhere within the source code.
Example:
ORG 0 ;no psp
SUBRT PROC FAR ;subroutine for BASIC program
BSAVE ;make BLOADable object file
B. COUNT ...ENDC: Count bytes.
COUNT was available in earlier versions of CHASM. With the
addition of operand expression support, COUNT became obsolete,
and was eliminated to make room for new features.
Existing programs using COUNT can be modified as in the
following example. The length of the string is calculated by
subtracting two labels, one just before the string, one just
after it:
MESSAGE COUNT
MSG_TXT DB 'This utility requires DOS 2.0!' beep cr lf
ENDC
MOV CX, LENGTH(MESSAGE)
becomes:
MSG_TXT DB 'This utility requires DOS 2.0!' beep cr lf
MSG_END
MOV CX, MSG_END-MSG_TXT
C. DB: Declare Bytes
Memory locations are filled with values from the operand list.
Any number of operands may appear, but all must fit on one
line. Acceptable operands are immediate data, or strings
enclosed in single quotes ('). DB interprets strings as a
series of ASCII bytes.
If a label appears, it is redefined as a memory location, and
the data area may be referred to using the label, rather than
an address. Examples:
MOV AX,MASK
MASK DB 00H,01H
STG DB 'A string operand'
CHASM generates an error message ("Data too large") if word
operands (such as OFFSETs, or numbers greater than 255) are
found in the DB operand list. DW should be used for declaring
words.
D. DM: Declare Multiple Bytes
Like COUNT, DM became obsolete when operand expressions became
available, and has been eliminated to make room for new
features. Existing programs using DM can be modified as
follows:
DM 500, ENTRYLENGTH
becomes:
DS 500*ENTRYLENGTH
E. DS: Declare Storage
Used to declare large blocks of identically initialized
storage. The first operand is required, a number specifying
how many bytes are declared. If a second operand in the form
of a number 0-FFH appears, the locations will all be
initialized to this value. If the second operand is not
present, locations are initialized to 0. As with DB, any
label is redefined as a memory location. To save space, the
object code does not appear on the listing. Examples:
DS 10 ;10 locs initialized to 0
DS 100H,1AH ;256 locs initialized to 1AH
F. DW: Declare Words
Used to unambiguously assign a word of storage for each item
in the operand list. Any number of immediate operands may
appear, but all must fit on one line. As with DB, any label
is redefined as a memory location. Example:
DW 0012H, FFFFH ;four bytes declared
G. EJECT: Begin New Print Page
When listing is enabled, causes CHASM to move to the top of
the next listing page. Normally has no effect on listings
sent to the screen or to disk, although you can enable EJECTs
in disk listings with CHASM's /DPAGE configuration switch.
H. EJECTIF: Conditional Page Break
Requires one immediate operand. If listing is enabled
and fewer than that many lines are left on the current page of
the listing, CHASM will move to the top of the next page.
If you put an appropriate EJECTIF at the beginning of each
procedure or section of your programs, CHASM will keep them in
one piece. Like EJECT, EJECTIF normally has no effect on
listings to the screen or disk. You can enable EJECTIF for
disk listings with CHASM's /DPAGE configuration switch.
Example:
EJECTIF 20 ;following procedure is 20 lines long
I. ENDP: End of Procedure
See PROC (below) for details.
J. ENDSTRUC: End of Structure
==>Advanced version only. See STRUC (below) for details.
K. EQU: Equate
Used to equate a symbolic name with a number. The symbol may
then be used anywhere the number would be used. Use of
symbols makes programs more understandable, and simplifies
modification.
An alternate form of EQU encloses the number in square
brackets: []. The symbol is then interpreted as a memory
location, and may be used as an address for memory access.
This version is provided to allow symbolic reference to
locations outside the program segment. Examples:
MOFFSET EQU 0B000H
MONOCHROME EQU [0000H]
Warning: Difficult to debug errors may result from using a
======> symbol prior to its being defined by EQU. You are
strongly urged to group all your equates together at
the beginning of programs, before any other
instructions. See "Phase Error" in the Error Message
section.
L. INLINE: Generate Turbo Pascal inline statements
Instructs CHASM to output object code in the form of a text
file, suitable for including in Turbo Pascal inline
statements. The resulting object file is not directly
executable, but with minimal editing, can be compiled by Turbo
Pascal as inline data.
INLINE can appear anywhere in your source file, and requires
no operands.
See the "Execution of Assembled Programs" section of this
document for examples, and more details.
M. IFXX... [ELSE]... ENDIF: Conditional Assembly
===> Advanced version only.
CHASM's conditional assembly pseudo-ops can be used to cause
macros to expand differently according to the parameters
supplied on the invocation. Many different IF pseudo-ops are
provided, to allow testing for (in)equality, relative size,
and (non)existence of parameters. You can also test whether a
parameter is a register, and if so, what type.
For more details and examples, see the Macro section of this
document.
N. INCLUDE: Include file
==>Advanced version only.
INCLUDE requires one string operand, a filename enclosed in
single quotes. If desired, you can specify a drive and/or a
path as part of the filename.
The contents of the specified file are logically inserted into
the source file at the point where the INCLUDE appears.
INCLUDEs cannot be nested: an error message will be printed if
the specified file itself contains an INCLUDE. Example:
INCLUDE '\ASM\STDIO.HDR' ;bring in standard library
O. LIST: Enable listing output
Output to the list device is enabled, presumably after a
NOLIST was encountered. No operands required.
P. MACRO ...ENDM: Macro Definition
==> Advanced version only.
Declares a macro. MACRO requires no operands, and signals
CHASM that the following lines constitute a macro definition,
and are to be stored for later use, rather than assembled in
place. A label is required on the MACRO statement to name the
defined macro.
ENDM terminates the macro definition, and requires no
operands.
For more information and examples, see the section titled
"Macros".
Q. NOLIST: Disable listing output
Normal output to the list device is disabled. Error messages
are listed as usual. No operands required.
R. ORG: Origin
Allows direct manipulation of the location counter during
assembly. By default, CHASM assembles code to start at offset
100H, thus leaving room for the program segment prefix
normally built by COMMAND or DEBUG. In situations where no
PSP is provided, such as routines to be called from BASIC, you
should override this default with ORG, or incorrect assembly
may result.
ORG requires one operand, a number between 0 and FFFFH, which
represents the new setting of CHASM's location counter.
Although the location counter may be reset anywhere within a
program, generally this pseudo-op should be used before any
machine executable instructions for meaningful results.
Example:
ORG 0 ;Code will be assembled for starting
;offset of 0
S. PROC ...ENDP: Procedure Definition
Declares a procedure. One operand is required on PROC, either
the word NEAR, or the word FAR. This pseudo-op warns CHASM
whether to assemble returns as intrasegment (near) or
intersegment (far). Procedures called from within the program being
assembled should be declared NEAR. Generally, all others
should be FAR. ENDP terminates the procedure, and requires no
operands. If a RET is encountered outside of a declared
procedure, an error occurs. Procedures may be nested, up to
10 deep. Example:
MAIN PROC FAR
...
... ;body of procedure
ENDP
T. RADIX: Default Number Base
CHASM's default radix is 10, meaning that numbers are assumed
to be in base 10 unless they end in "B" or "H". The RADIX
pseudo-op allows you to change this default. Allowed RADIX
values are 16 and 10. Setting RADIX 16 allows you to specify
hex numbers without the trailing "H".
Note that when RADIX 16 is in effect, there is no way to
specify numbers in base 2 (binary) or base 10 (decimal). To
write in either of these bases, shift back to RADIX 10. For
example:
RADIX 16
mov ax, 1B ;means 1B in hexadecimal
mov ax, 20 ;means 20 in hexadecimal
mov ax, 30H ;trailing "H" allowed, but not necessary
RADIX 10
mov ax, 1B ;means 1 in binary
mov ax, 20 ;means 20 in decimal
mov ax, 30H ;means 30 in hexadecimal
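The number rules above can be summarized in a few lines of
code. This is a minimal sketch in Python (for illustration
only, it is not how CHASM itself is written), and the function
name parse_number is made up. It handles only the suffixes
described in this section.

```python
def parse_number(text, radix=10):
    """Convert a numeric literal to an int, following the RADIX rules."""
    digits = text.upper()
    if radix == 16:
        # Everything is hexadecimal; a trailing "H" is allowed but optional.
        if digits.endswith("H"):
            digits = digits[:-1]
        return int(digits, 16)
    if radix == 10:
        if digits.endswith("H"):
            return int(digits[:-1], 16)   # hexadecimal
        if digits.endswith("B"):
            return int(digits[:-1], 2)    # binary
        return int(digits, 10)            # plain decimal
    raise ValueError("RADIX must be 10 or 16")


print(parse_number("1B", 16))   # 27
print(parse_number("20", 16))   # 32
print(parse_number("1B", 10))   # 1
print(parse_number("30H", 10))  # 48
```

Note how "1B" changes meaning with the radix: under RADIX 16
the "B" is a hex digit, not a binary suffix.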
U. STRUC ...ENDSTRUC: Structure Definition
==> Advanced version only.
Declares a structure. STRUC requires no operands, and signals
CHASM that the following lines constitute a structure
template, and not actual storage declaration. If a label
appears on the STRUC, the label is equated with the length of
the structure.
ENDSTRUC terminates the structure definition, and requires no
operands.
Inside the structure, storage defining pseudo-ops behave
somewhat differently. See the section titled "Structures" for
more information. Example:
DIRENTRY STRUC ;disk directory entry
NAME DS 8
EXT DS 3
ATRIB DS 1
RESERVED DS 10
TIME DS 2
DATE DS 2
START DS 2
SIZE DS 4
ENDSTRUC
V. WAITON / WAITOFF: Toggle Automatic WAIT Assembly
CHASM normally assembles WAIT instructions before most 8087
instructions. After a WAITOFF pseudo-op is encountered, CHASM
will not add WAITs. This allows you to let the 8088 and 8087
run in parallel for greater speed, putting in WAITs manually
where synchronization is important. WAITON turns automatic
WAIT assembly back on.
W. CHKJMP / NOCHKJMP
Following a CHKJMP pseudo-op, CHASM will check each JMP
instruction to see if it could have been coded as JMPS (to
produce tighter code). JMP instructions with displacements
smaller than 128 bytes will be flagged with a diagnostic
message ("Could Use JMPS"). NOCHKJMP turns off JMP checking.
X. = : Assignment to Assembler Variable
CHASM supports the use of "assembler variables", which can be
dynamically redefined throughout your program. The assignment
operator "=" acts similarly to pseudo-op EQU. A label on a
"=" line will be defined as equivalent to the operand, until
it is redefined by another assignment statement. Unlike EQU,
with "=" you can assign any valid operand type (except string)
to the assembler variable. These assignments do *NOT* result
in object code generation - they simply redefine a symbol to
have different meanings to CHASM in different parts of your
program. For example:
x = [80]
mov ax, x ;means mov ax, [80]
x = bx
mov ax, x ;x redefined, means mov ax, bx
x = 100H
mov ax, x ;redefined again, means mov ax, 100H
>>Macros<<
==>Advanced version only.
Macros are an advanced feature. Beginners may wish to skip this
section until they become more experienced with CHASM and
assembly language programming.
A. Introduction
Macros are a shorthand way of writing frequently used
sections of code. Using macros, you can write a code fragment
once, then any time you want to use it, just reference it by
name. Once you define a macro and give it a name, CHASM will
automatically "expand" the name, wherever it sees it, into the
previously defined code.
For example, suppose you were writing a large program, and at
the beginning of each subroutine you pushed all the general
purpose registers onto the stack to save their contents.
Every subroutine would start out something like this:
push ax ;save register contents
push bx
push cx
push dx
Before long, you'd get pretty sick of writing the same thing
for each subroutine. Macros are a way to put some of the
boring, repetitive nature of assembly language into the hands
of the assembler, freeing you up for the more creative
aspects. Here's how you define a CHASM macro to take the
place of this code fragment:
savestate macro
push ax ;save general purpose registers
push bx
push cx
push dx
endm
Note the MACRO and ENDM statements. These signal to CHASM
that the enclosed code is a macro definition to be saved for
later use, rather than code to be assembled at this point in
your program. The MACRO statement also gives a name
(SAVESTATE) to the macro, which will take the place of the
stored code.
Now, at the beginning of each subroutine you can just write
"SAVESTATE" when you want to push the general purpose
registers. Here's an example:
get_input proc near
savestate
...
...
endp
Given the previous macro definition, the above example is
*exactly* equivalent to:
get_input proc near
push ax
push bx
push cx
push dx
...
...
endp
The only difference is that less busy work is required on your
part.
Macros are NOT subroutines! Subroutines are coded once, then
called within your program at run time. Macros are expanded
in-line at assembly time, and the code is inserted into your
program at the invocation point. If you invoke the macro 20
times in your program, you'll end up with 20 copies of the
macro code.
Although this does waste some memory space, you save execution
time by eliminating the subroutine call and return process.
At a minimum, it takes 32 machine cycles to call a subroutine.
The four instruction macro given above requires 40 cycles to
execute. If it were coded as a subroutine, the time to
execute it would almost *double*. Macros trade off space for
speed.
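The arithmetic behind this trade-off is easy to check. Here is
a small sketch in Python (for illustration only, it is not
CHASM source). It assumes the 32 cycle figure covers the
complete call and return, and uses the 40 cycle figure quoted
above for the macro body. The function name is made up.

```python
def subroutine_slowdown(body_cycles, call_overhead=32):
    """Ratio of subroutine time to in-line (macro) time for the same body."""
    return (body_cycles + call_overhead) / body_cycles


print(subroutine_slowdown(40))   # 1.8, i.e. almost double
```

The shorter the body, the larger this ratio becomes, so short
fragments gain the most from being written as macros.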
B. Macro Parameters
CHASM's macros allow 9 user defined parameters, which are
evaluated at expansion time. The parameters work just like
those in DOS batch files. Here's an example of a macro with
one parameter. PRINT calls DOS function 9 to print a string
on the console. A parameter is used to specify the name of
the string to be printed:
print macro
mov ah, 09H ;specify print string function
mov dx, offset(%1) ;point to string
int 21H ;call DOS
endm
Given this definition, when CHASM sees a line like:
print title
The following code gets inserted in its place:
mov ah, 09H
mov dx, offset(title)
int 21H
Note how the "%1" in the macro definition got replaced by
"title" which was put as an operand on the invocation of the
macro. You can put up to 9 operands onto a macro invocation,
and they will be substituted for the dummy parameters %1
through %9.
If you put a label on a macro invocation, two things happen.
As usual, the label represents a program location for
branching, with offset equal to the beginning of the expanded
macro. You can branch into the expanded macro by using the
label as an operand on a jump or call instruction. In
addition, if the special dummy parameter "%0" is used in the
macro definition, it is replaced with the label text when the
macro is expanded. This feature is provided to facilitate
writing loops within macros.
If you fail to provide an operand for any dummy parameter used
in the macro definition, CHASM substitutes a null string of
length zero. To leave one parameter blank, but provide
replacements for parameters with higher numbers, use the
special operand "%B" for the one you want blank. For example:
def_mem macro
%1 db %2, %3
endm
def_mem twobytes, 128, FFH
def_mem %B, 'Hit any key when ready...'
Expands to:
twobytes db 128, FFH
db 'Hit any key when ready...'
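If it helps to see the substitution rules spelled out, here is
a minimal sketch of them in Python (for illustration only, it
is not CHASM's actual code). The function name expand_macro is
made up, and the operands are assumed to be already split into
a list. Internal labels and conditional assembly are left out.

```python
import re


def expand_macro(body, operands, label=""):
    """Expand one macro invocation.

    body     -- list of source lines stored between MACRO and ENDM
    operands -- list of operand strings from the invocation line
    label    -- label on the invocation line, substituted for %0
    """
    def replacement(match):
        number = int(match.group(1))
        if number == 0:
            return label
        if number <= len(operands):
            operand = operands[number - 1]
            # "%B" on the invocation means "leave this one blank".
            return "" if operand.upper() == "%B" else operand
        return ""  # missing operands become a null string

    # re.sub scans each line once, so text inserted for a parameter
    # is never itself rescanned for further %n symbols.
    return [re.sub(r"%([0-9])", replacement, line) for line in body]


print_body = ["mov ah, 09H", "mov dx, offset(%1)", "int 21H"]
for line in expand_macro(print_body, ["title"]):
    print(line)
```

Running this prints the same three lines shown above for
"print title". Because each line is scanned only once, an
operand which itself looks like a dummy parameter is inserted
as plain text.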
C. Internal Labels
A potential problem exists if you put a label on a statement
within a macro. The first time the macro gets invoked,
everything works fine. However, remember that a given label
can only be used once. The second time you invoke the macro,
the line with the label will get a "Duplicate definition"
error message.
CHASM offers "internal labels" for use within macros. When
expanding macros, CHASM replaces these internal labels with a
unique text, different for each invocation of the macro.
Macro internal labels are of the form:
%Lx
The "%L" signals CHASM that you want an internal label. The
"x" can be any character that isn't a delimiter. By using
different characters, you can define many different labels
in each macro.
Within the macro, you use the internal label symbol just like
a normal label. For example:
INTLAB MACRO
%LA MOV AX, DX
JMPS %LA
ENDM
Each time this macro is invoked, all occurrences of "%LA" will
be replaced with a different text. (The replacement text will
be of the form "%LAnnnn" where "nnnn" is a number.)
Using the symbol table dump on your listing, you can figure
out what text CHASM used in any given invocation. DON'T try
to use this information to branch into the macro invocation
from the outside. Editing your source file can cause CHASM to
use a different substitution text the next time you assemble.
Internal labels are intended for macro INTERNAL use only.
D. Macro Nesting
Macro invocations can be nested, up to 10 deep. Invocations
are maintained on a stack, and each invocation has its own
set of parameters and internal labels.
CHASM does not check for recursive macro calls. If you call a
macro from within itself, it's quite possible to get caught in
an endless loop. This can also happen indirectly, where a
macro invokes another macro, which ends up calling the first.
If you get caught, you can escape by hitting Ctrl-Break.
Experienced programmers can use macro recursion along with
expressions and conditional assembly to create some really
elegant macros. However, this is definitely not for beginners
or the faint of heart. Enter at your own risk.
E. Conditional Macro Expansion
This advanced feature allows you to write general purpose
macros which expand in different ways based on the parameters
used on the invocation.
For example, suppose you wanted to write a macro to do
operating system calls. Many of the DOS functions just load
the function number into AH, then call DOS with an interrupt.
Here's a macro to do this:
doscall macro
mov ah, %1
int 21H
endm
Given the above definition, "DOSCALL 1" will expand into a
call to DOS function #1, keyboard input.
Unfortunately, not all DOS calls are quite so simple. For
example, the call for printing a string to the console
requires that the offset of the string be loaded into DX.
Different system calls are going to require somewhat different
handling. One approach would be to write separate macros for
each function. However, with 87 different functions in DOS 2,
this starts to get unwieldy.
A more general approach is to incorporate some "intelligence"
into the macro, and let it expand differently for different
DOS functions. Here's a slightly more general macro, which
loads DX whenever the "print string" function is requested:
doscall macro
        mov ah, %1
        ife %1 9
           mov dx, offset(%2)
        endif
        int 21H
        endm
"IFE" is short for "If Equal". The "IFE" and "ENDIF" bracket
a line of macro code which will be included during expansion
only if the "IFE" statement is satisfied. In this example,
the MOV DX statement will be included only if the first
parameter is equal to 9. The enclosed line is indented in
this example to help show the structure of the macro, but
indentation is not required.
Given the above definition, "DOSCALL 1" expands to only two
lines:
mov ah, 1
int 21H
However, "DOSCALL 9, STRING" expands to three lines:
mov ah, 9
mov dx, offset(string)
int 21H
CHASM supports twelve different conditional test statements.
They are:
IFE "if equal"
IFNE "if not equal"
IFGT "if greater than"
IFGE "if greater than or equal"
IFLE "if less than or equal"
IFLT "if less than"
IFB "if blank"
IFNB "if not blank"
IFREG "if a register"
IFREG8 "if an 8 bit register"
IFREG16 "if a 16 bit register"
IFSEGREG "if a segment register"
Most of the conditionals take two operands, and perform a
comparison. The operands are evaluated, and their values are
compared according to the condition being tested. Strings are
compared in the normal alphabetical order sense. Examples:
IFE 15, 0FH ;is true
IFGT 'ABCD' 'EFGH' ;is false
IFLE 20H, 128 ;is true
IFB and IFNB require only one operand, which should be a dummy
parameter. These conditionals test whether the parameter was
left "blank" or if a replacement was provided on the
invocation.
The IFREG conditionals take one operand. They return true if
the operand is a register in the following sets:
IFREG: any register
IFREG8: AH, AL, BH, BL, CH, CL, DH, DL
IFREG16: AX, BX, CX, DX, DI, SI, BP, SP
IFSEGREG: CS, DS, ES, SS
You can place as many lines as you like between an IF
statement and the ENDIF. All will be included if the
condition tested is true, and none will be included if it's
false.
An optional "ELSE" construct is also provided. Statements
after the ELSE get included only if the tested condition is
false. CHASM's conditional assembly syntax can be summarized
as follows:
IF..... ;if statement with parameter(s)
s1 ;statements included if true
s2 ; " " " "
ELSE ;end of "true" option, begin "false"
s3 ;statements included if false
s4 ; " " " "
ENDIF ;end of conditional assembly
IF statements can be nested, up to 10 deep. During nesting,
CHASM assumes that ELSEs are paired with the latest unfinished
IF statement. For example:
example macro
        ife %1, 1
           db 'Outer IF is true'
           ifb %2
              db 'Inner IF is true'
           else
              db 'Where do I belong?'
           endif
        endif
        endm
"EXAMPLE 1, 5" expands to:
db 'Outer IF is true'
db 'Where do I belong?'
But "EXAMPLE 2" produces no expansion at all.
This is easiest to follow if you use "structured" indenting
when writing the macro, as in the example above. Remember,
however, that CHASM ignores indenting and follows the
"latest unfinished" rule in figuring out which IF gets the
ELSE. Don't fool yourself with improper indentation.
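The "latest unfinished IF" rule amounts to keeping a stack of
open IF statements. Here is a minimal sketch of that
bookkeeping in Python (for illustration only, it is not
CHASM's actual code). The function name and the tuple format
are made up, and each IF test is assumed to have been
evaluated already.

```python
def select_lines(items):
    """Return the lines that survive conditional assembly.

    items is a list of tuples:
      ("if", result)  -- an IF statement whose test evaluated to result
      ("else",)       -- ELSE, paired with the latest unfinished IF
      ("endif",)      -- ENDIF, closes the latest unfinished IF
      ("line", text)  -- an ordinary source line
    """
    stack = []      # one True/False entry per unfinished IF
    output = []
    for item in items:
        kind = item[0]
        if kind == "if":
            stack.append(bool(item[1]))
        elif kind == "else":
            stack[-1] = not stack[-1]   # flip only the innermost IF
        elif kind == "endif":
            stack.pop()
        elif all(stack):
            # A line is kept only if every enclosing branch is active.
            output.append(item[1])
    return output


def example(outer, inner):
    """The EXAMPLE macro above, with its two tests already decided."""
    return select_lines([
        ("if", outer),
        ("line", "db 'Outer IF is true'"),
        ("if", inner),
        ("line", "db 'Inner IF is true'"),
        ("else",),
        ("line", "db 'Where do I belong?'"),
        ("endif",),
        ("endif",),
    ])


print(example(True, False))   # the "EXAMPLE 1, 5" case
print(example(False, True))   # the "EXAMPLE 2" case: nothing survives
```

The ELSE flips only the top entry of the stack, which is why
it always belongs to the inner IF here, however the source is
indented.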
F. Final Notes:
The conditional assembly pseudo-ops are not restricted to use
only in macros, although there aren't that many useful
non-macro applications.
CHASM won't recursively expand parameters. For example:
RECUR MACRO
XOR %1, %2
ENDM
RECUR AX, %1
expands to:
XOR AX, %1 ;actual expansion
NOT to:
XOR AX, AX ;won't happen
Like equates and structures, macro definitions should all be
placed at the beginning of your programs, before the macro
gets invoked. If you invoke a macro before it's defined, a
phase error will occur.
Macro definitions are stored and expanded blindly by CHASM,
with no syntax checking. Dummy parameter symbols can appear
in any of the input line fields. Any errors in a macro
definition will only become evident when CHASM expands the
macro and attempts to assemble the result.
In keeping with the "shorthand" philosophy of macros, the
results of macro expansion don't normally appear on the
listing. If you want to see the expansion for debugging
purposes, insert a LIST pseudo-op as the first line of the
macro definition.
Once you start writing macros, you might find it useful to
gather them together into a single file. You can then pull
this file into the beginning of all your programs with an
INCLUDE pseudo-op. Macros which don't get invoked won't add
anything to the object code produced, and you'll save yourself
the trouble of typing in the ones you *do* need. If you put a
NOLIST at the beginning and a LIST at the end of your INCLUDE
file, you don't even have to look at the definitions on your
listings.
>>Structures<<
==>Advanced version only.
Structures are an advanced feature. Beginners may wish to skip
this section until they become more experienced with CHASM and
assembly language programming.
CHASM's structure capability allows you to generate a "template"
with which to organize repetitive data structures. As an example
of a repetitive data structure, consider a phone list. Each
entry in the list has three components:
Name - 20 characters
Address - 50 characters
Phone - 10 characters
Each entry thus uses a total of 80 bytes of storage. To declare
a list with 500 entries, you would declare 80 x 500 = 40000 bytes:
PHONELIST DS 40000 ;500 entries @ 80 bytes/entry
That's easy enough, but now what's the offset of the 346th
address field? This is starting to get confusing, and time
consuming to figure out.
Furthermore, hard-coding numbers (like that 40000, above) into a
program is never a good idea. What's going to happen when you
decide to add a zip code field, or make more room in the name
field because you just met Alexandria Zbrievskivich? You'd have
to go through and change by hand each number which depended on
the actual layout of your data. Murphy's law guarantees that
it'll take somewhere between "several" and "too many" assemblies
to find them all.
Structures allow you to set up a symbolic template which makes it
much easier to manage structured data like this phone list. By
using symbols defined in the structure, rather than bare numbers,
a change in data structure doesn't mean a frantic search
throughout your entire program to make corrections. If you
change the structure definition, the symbols take on new values
automatically.
Here's what the phone list structure looks like:
LISTENTRY STRUC
NAME DS 20
ADDRESS DS 50
PHONE DS 10
ENDSTRUC
Note the STRUC and ENDSTRUC statements. They mark the beginning
and end of the structure, and give it a name (LISTENTRY).
Within the structure are storage defining pseudo-ops. DS is used
in this example, but DB and DW could also be used. ORG can be
used within a structure, but any 8088 instruction will result in
a diagnostic message and termination of the structure
definition.
Inside a structure, the storage defining pseudo-ops
behave somewhat differently than normal. No actual storage is
set aside, but CHASM keeps track of how much space would normally
be declared.
Labels on pseudo-ops get assigned values equal to their offset
within the *structure*, not within the program as a whole. Also,
CHASM will treat the labels as immediate operands, rather than
memory locations. The result of the structure given above is to
generate three immediate operands, with the following values:
NAME = 0
ADDRESS = 20
PHONE = 70
CHASM does one other piece of useful book-keeping during
structure definitions. If a label appears on the STRUC
pseudo-op, it gets treated as an immediate operand, whose value
is equal to the total length of the structure.
You can either use the STRUC label directly as an operand, or if
you like "pretty" code, use the LENGTH operator on it. In this
example, both LISTENTRY and LENGTH(LISTENTRY) are immediate
operands of value 80.
(Inside note: LENGTH is a null operator, provided mainly for
aesthetic reasons. Using LENGTH will often make your code more
readable, but is equivalent to using just the label itself.)
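The book-keeping described above is just a running total. Here
is a minimal sketch of it in Python (for illustration only, it
is not CHASM's actual code). The function name layout is made
up, and each field is given simply as a label and a size in
bytes.

```python
def layout(fields):
    """Return ({label: offset}, total_length) for (label, size) pairs."""
    offsets = {}
    position = 0
    for label, size in fields:
        offsets[label] = position   # offset within the structure
        position += size            # no storage is set aside, only counted
    return offsets, position


offsets, length = layout([("NAME", 20), ("ADDRESS", 50), ("PHONE", 10)])
print(offsets)  # {'NAME': 0, 'ADDRESS': 20, 'PHONE': 70}
print(length)   # 80
```

Widening NAME to 30 bytes would move ADDRESS to 30 and PHONE
to 80, and the total would become 90, with no other changes
needed.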
Like equates and macros, structures should be placed at the
beginning of your programs before any machine instructions,
otherwise phase errors can occur. If you *do* embed structures
inside your program, you can eliminate any phase errors by using
the LENGTH function to reference any of the immediate operands
generated by the embedded structure.
The immediate operands generated during a structure definition
can be very useful in writing your program. Following the phone
list example, here's a better way to declare storage for the
list:
NUMENTRYS EQU 500
PHONELIST DS LISTENTRY*NUMENTRYS ;500 entries of length
;defined by the structure.
Now if you add another field to the structure, PHONELIST will
automatically increase in size.
The 8088's indirect addressing modes, coupled with structures,
make a very powerful combination for accessing structured data in
memory. Suppose AX contains the number of the entry you want to
work on. You can calculate the address of the entry as follows:
MOV BX, LENGTH(LISTENTRY) ;length per entry
MUL AX, BX ;times entry number
ADD AX, OFFSET(PHONELIST) ;plus the starting offset
MOV BX, AX ;BX <== frame pointer
BX is now a "frame pointer" - it points to the beginning of the
desired entry. You can access the various parts of the entry
using the optional displacement field in the indirect address.
For example, here's how you would store the letter 'A' into the
first byte of the address field:
MOV ADDRESS[BX], 'A'
As another example, the following line reads the fourth letter of
the name (the byte at offset 3) into the AL register:
MOV AL, NAME+3[BX]
The first entry in a structure almost invariably has value 0.
CHASM "optimizes" indirect addresses with structure-generated
constants equal to zero, generating the no-offset form of the
address. In this example:
MOV AX, NAME[BX]
actually assembles as MOV AX, [BX] rather than MOV AX, 0[BX]
The resulting code is more compact, and runs faster.
The more complicated indirect modes can be used to scan through
or point within the fields. The following program fragment gets
the fourth digit in the phone number:
MOV DI, 3 ;offset 3 is the 4th digit
MOV AL, PHONE[BX+DI] ;read it into AL
Often, you'll want to process entries sequentially. To
move to the next entry, you just add the length of the entry to
the frame pointer:
ADD BX, LENGTH(LISTENTRY) ;point to next entry
The preceding examples should give you a feel for what you can
do with structures, but do not exhaust all the possibilities.
Experiment with this feature, and many of your programs will be
both more readable and more easily modified.
52
>>8087 Support<<
===> Advanced version only.
In addition to the normal 8088 instructions, CHASM's Advanced
version supports all the 8087 mnemonics. Look in the appendices
for a list of available instructions.
Several books describing the functions of the 8087 instructions
are available. CHASM's 8087 support is based on Appendix D of
Intel's 8087 Support Library, and the book "Assembly Language
Programming for the IBM Personal Computer" by David J. Bradley
(Prentice-Hall, 1984).
ALL the 8087 instructions that reference memory for a real or
integer quantity are ambiguous as to the size of operand.
Integers can be Word (2 bytes), Short (4 bytes), or Long (8
bytes). Reals can be Short (4 bytes), Long (8 bytes) or
Temporary Real (10 bytes).
As with 8088 memory references, CHASM resolves ambiguities using
a suffix. The following suffixes can be added to mnemonics to
specify operand size:
W: word
S: short
L: long
T: temporary real
Here are some examples of using suffixes:
FADD [200H] ;wrong! ambiguous memory reference
FADDT [200H] ;add 10 byte temporary real
FILD CS:[DI] ;wrong! ambiguous memory reference
FILDW CS:[DI] ;load 2 byte integer
CHASM automatically generates a WAIT instruction prior to most
8087 instructions. The only exceptions are the "No Wait"
instructions with an "N" as the mnemonic's second letter.
Examples:
FYL2X ;automatically preceded by WAIT
FNCLEX ;no wait form of FCLEX
53
If you'd prefer to manually add WAITs only where needed to
synchronize critical instructions, the WAITOFF pseudo-op disables
CHASM's automatic WAIT assembly. Using fewer WAITs allows the
8088 and 8087 to run in parallel more often, giving better
performance. Make sure you synchronize with a WAIT before
allowing the 8088 to access a location being modified by the
8087. You can turn CHASM's automatic WAIT assembly back on using
a WAITON pseudo-op.
You must use the operand modifier form for segment overrides on
8087 instructions. A separate SEG instruction will modify
CHASM's automatically generated WAIT instruction, and won't
affect the intended 8087 instruction. Example:
SEG CS
FLDS [100H] ;wrong!
FLDS CS:[100H] ;correct
54
>>Outside the Program Segment<<
As mentioned previously, CHASM does not support multiple segment
definitions. Provision is made for limited access outside of the
program segment, however.
A. Memory References:
To access memory outside the program segment, you move a new
segment address into the DS register, then address using
offsets in the new segment. The memory option of the EQU
pseudo-op allows you to give a variable name to offsets in
other segments. For example, to access the BIOS equipment flag:
BIOS_DATA EQU 40H
EQUIP_FLAG EQU [0010H]
MOV AX,BIOS_DATA ;can't move immed. to DS
MOV DS,AX
MOV AX,EQUIP_FLAG ;get bios equipment flag
B. Code Branching:
CHASM supports 4 instructions for branching outside the
program segment.
1. Direct CALL and JMP
New values for the PC and CS registers are included in the
instruction as two immediate operands. Example:
BIOS EQU 0F000H ;ROM bios segment
DISKETTE_IO EQU 0EC59H ;disk handler
JMP DISKETTE_IO,BIOS
2. Indirect CALLF and JMPF
Four consecutive bytes in memory are initialized with new
values for the PC and CS registers. The CALLF or JMPF then
references the address of the new values. Example:
BIOS EQU 0F000H ;ROM bios segment
PRINTER_IO EQU 0EFD2H ;printer routine
MOVW [DI],PRINTER_IO ;W suffix: store a word
MOVW 2[DI],BIOS
CALLF [DI]
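If you want to see what those four bytes look like, here is a
small Python sketch. It relies only on two facts about the 8088:
words are stored low byte first, and a segment:offset pair maps to
the 20-bit address segment*16 + offset. The function names are
hypothetical; the constants are the ones from the example above.

```python
import struct

PRINTER_IO = 0xEFD2   # offset, from the example above
BIOS = 0xF000         # segment, from the example above


def far_pointer_bytes(offset, segment):
    """Four bytes as laid out in memory: offset word first, then segment word."""
    return struct.pack("<HH", offset, segment)


def physical_address(segment, offset):
    """20-bit address the 8088 forms from a segment:offset pair."""
    return ((segment << 4) + offset) & 0xFFFFF
```

The offset goes at [DI] and the segment at 2[DI], which is why the
example stores PRINTER_IO first and BIOS second.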
55
>>Running CHASM<<
A. Prompt Mode
From DOS, type:
CHASM
If you're using the Subset version, a hello screen is printed,
followed by the message:
Hit Esc to exit, anything else to continue...
Advanced CHASM skips the commercial message.
You're now presented with a series of prompts:
Source code file name? [.asm]
Type in the name of the file which contains your program. If
you don't include an extension for the filename, CHASM
assumes it to be .ASM. If CHASM is unable to find the file,
it will give you the option of naming another file, or
returning to DOS. Note that anywhere CHASM expects a
filename, you can optionally include a drive and/or path.
Assuming your file is present, CHASM prompts:
Direct listing to Printer (P), Screen (S), or Disk (D)? [nul:]
Respond with either an upper or lower case letter. If you
just press enter, no listing will be produced. If you select
"D", CHASM will prompt:
Name for listing file? [fname.lst]
Type in a name for the listing file. If you just press ENTER,
the name defaults to that of your source file, with an
extension of .LST.
56
Listing Notes:
1. The setting of the /PATH configuration switch controls
the exact form of the default list file name.
2. Regardless of where the listing is sent, error messages
are always echoed to the screen.
3. Suppressing the listing will result in faster assembly.
The final prompt is:
Name for object file? [fname.com]
Type in a name for the assembled program. If you just press
ENTER, the name defaults to that of your source file, with an
extension of .COM.
Note: The setting of the /PATH configuration switch controls
the exact form of the default object file name.
CHASM now assembles your program. A status line is maintained
on the screen, showing how many lines have been processed,
along with how many errors have been discovered. CHASM makes
two passes over your source file, outputting the listing and
object code on the second pass. You can pause assembly at any
time by hitting Ctrl-S (or just S). Hitting any key then
resumes assembly. You may abort assembly and return to DOS at
any time by hitting Esc, Ctrl-C or Ctrl-Break.
At the end of the second pass, a final summary of the assembly
process is printed. It will look something like:
0 Error(s) detected
0 Diagnostic(s) offered
954 (3BAH) Bytes of object code generated
This information should be self-explanatory. The number of
bytes is given in both decimal and hex format.
57
If labels appeared in your program, a dump of the symbol table
will follow. This lists each user-defined symbol, along with
its value (in hex). The symbols are printed in alphabetical
order. Each value is preceded by a one-letter code, which
tells the symbol's type:
P: a program location
M: a memory location for data
I: immediate data
Upon exit, CHASM sets the system variable ERRORLEVEL to
(surprise!) the total number of errors discovered in your
source file. If you run CHASM from a batch file, you can use
this feature to automatically invoke your text editor if
errors were discovered.
B. Expert Mode:
This mode allows you to specify all i/o information on the
command line which invokes CHASM. The syntax is:
CHASM sourcefile [p|s|d|/] [listfile|/] [objectfile]
Items within brackets ([]) are optional. You may select *one*
of any list of items separated by a bar (|).
Basically, you just include on the command line all your
responses to the normal prompts. Each response must be
separated from the others by either a space or comma.
If you don't specify the list device/file or the object file,
they default to NUL: and sourcename.COM respectively. To
represent a carriage return (to specify a default choice, but
allow modifying a later response) use the character slash (/).
58
Expert mode examples:
1. Source file is EXAMPLE.ASM, no listing, object file
EXAMPLE.COM:
CHASM example
2. Source file is SDIR.ASM, list to printer, object file
SDIR.COM:
CHASM sdir p
3. Source file is MYFILE.PRG, list to disk file MYFILE.LST,
object file SUBR.COM:
CHASM myfile.prg d / subr.com
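The defaulting rules can be summarized in a few lines of Python.
This is only a sketch of the rules as described above, not CHASM's
actual parser. It assumes bare file names (no drive or path),
ignores the /PATH configuration switch, and assumes a list file
name is only expected after "d", as in prompt mode. The function
name and the "printer"/"screen" labels are hypothetical.

```python
import re

DEVICES = {"P": "printer", "S": "screen"}


def parse_expert_command(tail):
    """Split the text typed after CHASM into source, listing and object choices."""
    # Responses are separated by spaces or commas.
    tokens = [t.upper() for t in re.split(r"[ ,]+", tail.strip()) if t]
    if not tokens:
        raise ValueError("a source file name is required")
    source = tokens.pop(0)
    stem, dot, _extension = source.partition(".")
    if not dot:
        source = stem + ".ASM"          # default source extension

    def next_response():
        # A missing response and a slash both mean "take the default".
        response = tokens.pop(0) if tokens else "/"
        return None if response == "/" else response

    listing = "NUL:"                    # default: no listing
    choice = next_response()
    if choice in DEVICES:
        listing = DEVICES[choice]
    elif choice == "D":
        listing = next_response() or stem + ".LST"
    elif choice is not None:
        raise ValueError("listing choice must be P, S, D or /")
    object_file = next_response() or stem + ".COM"
    return {"source": source, "listing": listing, "object": object_file}
```

Feeding it the three example command tails above gives the same
file names as listed in the examples.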
59
>>Error and Diagnostic Messages<<
Error messages generated on pass one appear on the listing before
any source code is printed, and mention the line number to which
they refer. The majority of messages occur during pass two, and
will appear in the listing immediately prior to the line which
caused the message. Unless the listing itself is going to the
screen, messages and the source line which generated them will be
echoed there.
Add Leading Zero to Hex Constant: Diagnostic. The unknown
symbol could be interpreted as a hexadecimal number if a
leading zero was added.
CHASM Internal Error: XX PC: YYYY or
CHASM I/O Error: XX PC: YYYY
Sigh. You just discovered a problem in CHASM itself. Please
contact Whitman Software, following the procedure in the
appendix on Bug Reporting.
Could Use JMPS: Diagnostic. The specified label requires an
offset of less than 128 bytes; specifying the short jump would
result in more compact code. The assembled code is correct,
however.
Conditional Nested Too Deeply: IF statements can only be nested
10 deep.
Data too Large: You are attempting to use a word of immediate
data where only a byte is allowed.
DM out of range: The product of the DM's operands is either
negative, or greater than 32767.
Duplicate Definition of XXX in (linenum): Pass 1 error. An
attempt was made to define a symbol already present in the
symbol table.
ELSE without IF: An ELSE was encountered, but no corresponding
conditional pseudo-op was found.
ENDIF without IF: An ENDIF was encountered, but no corresponding
conditional pseudo-op was found.
ENDM without MACRO: An ENDM was encountered, but no
corresponding MACRO was found.
60
ENDP without PROC: An ENDP was encountered, but no corresponding
PROC was found.
ENDSTRUC without STRUC: An ENDSTRUC was encountered, but no
corresponding STRUC was found.
EQU Without Label: No symbol was found to equate with the
operand.
File not found: XXX in (linenum). Pass one error. CHASM was
unable to find the file XXX, specified in the INCLUDE
pseudo-op.
Heap Full: Too many XXX. Usually a Pass one error.
You've run out of memory for the symbol and macro tables. You
shouldn't see this message unless you have only 128K and are
assembling a very large program.
Illegal Label: XXX in (linenum). Pass one error. The symbol
XXX begins in column one and has as its first character a
number, or a plus or minus sign. Alternatively, you tried to
use a reserved word or symbol as a label.
Illegal Operation for Structure - ENDSTRUC Implied: Diagnostic.
The current line is within a structure, and is not a storage
defining pseudo-op. CHASM generates an ENDSTRUC, which
terminates the structure definition, then assembles the line
normally.
Illegal or Undefined Argument for LENGTH: The argument for the
LENGTH function was not present in the symbol table as an
immediate operand on pass 2.
Illegal or Undefined Argument for OFFSET: The argument for the
OFFSET function was not present in the symbol table as a near
label or memory location on pass 2.
Missing ENDM: The end of the input file was encountered, and at
least one MACRO had not been terminated by an ENDM.
Missing ENDP: The end of the input file was encountered, and at
least one PROC had not been terminated by an ENDP.
Missing ENDSTRUC: The end of the input file was encountered, and
at least one STRUC had not been terminated by an ENDSTRUC.
61
Multiple Segment Overrides are Illegal: Diagnostic. You have
specified more than one segment override on this instruction.
The first override is used, and the other(s) ignored.
Nested INCLUDE: An INCLUDE was encountered in an INCLUDEd file.
The INCLUDE pseudo-op cannot be nested.
Nested Structure: A STRUC was encountered inside a structure.
Structures cannot be nested.
No Name For Macro: Diagnostic. The macro statement did not have
a label. CHASM is unable to give a name to the macro, and you
will be unable to reference it.
Operands Not Compatible: The size of the two operands does not
match.
Phase Error: A label or memory location is found to have
different values on pass 1 and pass 2. A difficult to debug
error: generally the problem is not caused by the statement
which received the error message. The problem is caused by an
improper statement before this one, but after any other labels
(otherwise *they* would have received the error message).
When phase errors are discovered, CHASM prints this message,
then resynchronizes the location counter to match the offset
calculated on pass one. If further phase errors are reported,
the line responsible for each subsequent error will be located
between two Phase Error messages, but after any unflagged
labels.
There are four documented ways to generate phase errors.
1. A previous instruction used a symbolic immediate operand
prior to the symbol's definition.
2. A previous instruction made improper use of a forward
referenced label, either an attempt to branch into a data
area, or to access a code area as if it was data.
3. The label on the flagged statement is defined more than
once in the program.
4. A previous instruction invoked a macro prior to its
definition.
62
Whitman Software would appreciate hearing about any other
situations which cause the Phase Error message to appear.
Parameter Too Large for Expansion: Diagnostic. Replacement of the
dummy parameter would cause the macro line to exceed CHASM's
internal 255 character limit for manipulating strings.
Generally this message will be accompanied by the "Source Line
Truncated" message, warning that a line has exceeded the
allowed 80 columns.
Procedures Nested Too Deeply: Procedures may be
nested no more than 10 deep.
Source Line Truncated: The length of the input line exceeded 80
characters.
Specify Word or Byte Operation: Diagnostic. CHASM suggests that
the Syntax Error might be resolved by adding the suffix "B" or
"W" to the instruction mnemonic. Most, but not all, ambiguous
memory references are flagged with this diagnostic.
Syntax Error: (OP) (DEST), (SOURCE). CHASM was unable to find a
version of the instruction (OP) which allows the operands
(DEST) and (SOURCE). Either the instruction doesn't exist, or
it is an inappropriate choice for the given operands. The (OP)
(DEST), (SOURCE) is a reconstruction of your source line based
on how CHASM parsed it. A comparison of the reconstruction and
your original source code will sometimes help pinpoint the
error.
Syntax Error messages are followed by two diagnostics which
spell out in words CHASM's best guess about the operands.
Again, a comparison between CHASM's guesses and what you
really meant can help find the problem.
Too Far For Short Jump: The displacement to the specified label
is not in the range -128 to +127.
Undefined Operand for EQU: Any operands on an EQU statement must
have been previously defined.
Undefined Symbol XXX: The symbol XXX was used as an operand, but
never appeared as a label, and is not a predefined symbol.
63
Unrecognized Operand XXX: XXX is used in the DB or DW operand
list, but is not a valid immediate operand. (or string, in the
case of DB).
64
>>Execution of Assembled Programs<<
A. Object code format
The object code file produced by CHASM is in the form of a
memory image, exactly as will be present in your computer at
run time. No link step is required. Provided that the segment
registers are set correctly, the architecture of the 8088
guarantees that code is self-relocating, and will run correctly
loaded anywhere in memory. Storing a program as an exact image
of memory at run time is called the COM format by IBM.
This COM format is *not* that produced by the IBM assembler.
The output of the IBM assembler is in the form of an "object
module" suitable for input to the linker. The object module
is not directly executable, but must first be "filtered"
through the linker. This adds an extra step to the process of
producing a working program, but gives you the option of
combining multiple object modules into one program. The
resulting linked program is *still* not a memory image, but
has a header which is used to perform relocation during
loading. This linked program plus header is called the EXE
format by IBM.
B. Running Assembled Programs From DOS
DOS provides a loader for running machine language programs.
To run a program, you merely type its name, without the
extension. This is what you're doing every time you use a DOS
external command such as FORMAT or CHKDSK. In fact, the COM
format is named after "external COMmand".
When DOS loads a program, it examines the file extension to
determine what format the file is in, either COM or EXE. This
is why CHASM defaults to using the extension .COM for your
object file. If you plan to run the program from DOS, don't
change the extension.
For COM programs, DOS builds a 256 byte long "program segment
prefix" and sets the segment registers to point to this PSP.
The contents of the file are then loaded verbatim right after
the PSP, at offset hex 100 in the segment defined by the
segment registers. As soon as loading is complete, your
program is executed starting with the instruction at hex 100.
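A toy model of the loading step, in Python, may make the layout
concrete. It is a sketch only: the PSP is left blank rather than
filled in as DOS would, and the names are hypothetical. The sample
bytes are MOV AH,0FH / INT 10H / INT 20H.

```python
PSP_SIZE = 0x100   # the PSP occupies offsets 0000H through 00FFH

# MOV AH,0FH / INT 10H / INT 20H
SAMPLE = bytes([0xB4, 0x0F, 0xCD, 0x10, 0xCD, 0x20])


def load_com_image(file_bytes):
    """Model of the segment: a blank PSP followed by the file, verbatim."""
    return bytearray(PSP_SIZE) + bytearray(file_bytes)


def run_time_offset(file_position):
    """Offset within the segment of the byte stored at `file_position` in the file."""
    return PSP_SIZE + file_position


SEGMENT = load_com_image(SAMPLE)
```

The first byte of the file ends up at offset hex 100, which is
where execution starts, and every later byte is displaced by the
same amount.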
65
Although you can totally ignore the PSP, you should read pages
E-3 through E-11 of the DOS manual to see what DOS puts there
for you. It turns out there are some real goodies which your
program might want to use.
When your program is done, it must transfer control back to
DOS, otherwise the 8088 will continue to fetch what it
believes are instructions from whatever garbage or bit-hash
happens to follow your program in memory. The easiest way to
return to DOS is to execute the instruction:
INT 20H
This is the vectored interrupt reserved by DOS for program
termination.
While we're on the topic of vectored interrupts, you would be
well rewarded to study both the DOS Technical Reference and
Hardware Technical Reference Manuals to find out what happens
when you execute some of the other interrupts. Some very
useful functions, such as file handling and screen i/o, are
available at the machine language level through this
mechanism. Information on interrupts is also available in
Peter Norton's book "Programmer's Guide to the IBM PC", which
is cheaper than buying both of IBM's reference manuals, and
also more readable.
Looking at things the other way, by changing the interrupt
vector for a given function to point to your own code, you can
override the way DOS or the BIOS does something, and do it
your way. DOS even provides a method (via interrupt 27H) by
which your new code can be grafted onto DOS, and not be
overwritten by other programs.
C. Debugging Assembled Programs
IBM provides an excellent utility with DOS, called DEBUG.COM.
By specifying your program's name as a parameter when invoking
DEBUG, you can observe your program execute with DEBUG's trace
and other functions. To debug your program, from DOS type:
DEBUG progname.COM
66
DEBUG builds a PSP and loads your program just like DOS does,
but you have the added power of the debugging commands to
monitor your program while it runs. See chapter 6 of the DOS
manual for more details about using DEBUG.
On the topic of debugging, I can recommend most highly a
program called TRACE86, from Morgan Computing (10400 N.
Central Expressway, Suite 210, Dallas, TX 75231). The program
replaces DEBUG, and although rather steeply priced, makes the
IBM debugger look silly. I've been using TRACE86 for some
time now, and wouldn't be without it.
D. Using Assembled Programs in BASIC
To incorporate a machine language subroutine in a BASIC
program, write it in assembly language, then assemble it with
CHASM. You should read page C-7 of the BASIC manual for some
conventions to use in writing your subroutine. In particular,
note that you must declare the routine to CHASM as a FAR
procedure using the PROC pseudo-op, and that the last
instruction of the routine should be a RET.
Unlike programs which are run directly from DOS, your routine
will not be preceded by a program segment prefix. You should
prevent CHASM from leaving room for a PSP by putting an ORG 0
pseudo-op at the beginning of your routine. If you don't
include the ORG, memory references will not be assembled
correctly. Example:
ORG 0 ;no psp
SUBR PROC FAR ;far procedure
... ;body of subroutine
RET
ENDP
CHASM supports two methods for getting assembled routines into
BASIC programs. The methods differ in whether the routine is
included in the BASIC program file, or in a separate file.
A utility program called COM2DATA is provided for including
machine language within BASIC program files. The program is
distributed in source code form (file COM2DATA.ASM) and must
be assembled with CHASM prior to use. The program functions
as a DOS 2 filter, reading a COM file in from the standard
input, and writing a series of BASIC DATA statements to the
standard output.
67
COM2DATA's syntax is as follows:
COM2DATA [<infile] [>outfile] [linenum]
You specify the input and output files as with any DOS 2
filter. The linenum parameter sets the starting line number
used on the BASIC code produced. If you don't specify
linenum, it defaults to 1000.
If you MERGE the file of DATA statements into your BASIC
program, the program can then READ the data and POKE it into
memory. An example program to do this is given on page C-6 of
the BASIC manual. An alternative approach would be to store
the routine in a string variable, which could later be located
with the VARPTR function.
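To give a feel for what such a filter does, here is a rough Python
equivalent. It is not the COM2DATA source, and the exact layout of
COM2DATA's output is an assumption here: decimal values, a fixed
number of bytes per line, and line numbers rising by 10. Only the
default starting line number of 1000 comes from the description
above.

```python
def com_to_data(code, first_line=1000, step=10, per_line=8):
    """Render bytes as BASIC DATA statements (hypothetical layout)."""
    lines = []
    for index in range(0, len(code), per_line):
        chunk = code[index:index + per_line]
        number = first_line + step * (index // per_line)
        values = ",".join(str(byte) for byte in chunk)
        lines.append("%d DATA %s" % (number, values))
    return lines
```

For example, the four bytes B4 0F CD 10 (hex), written two per
line, become "1000 DATA 180,15" and "1010 DATA 205,16".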
If you would prefer to keep your machine language subroutine
in a separate file, include a BSAVE pseudo-op somewhere within
your assembly language source code. CHASM will build a header
on the object code produced, which will mimic that built by
BASIC's BSAVE command. The resulting file may be BLOADed by
BASIC to any location in memory.
You transfer control to your routine with either the USR
function, or the CALL statement. Syntax for these statements
can be found in the BASIC manual.
68
E. Using Assembled Programs with Turbo Pascal:
CHASM and Turbo Pascal work splendidly together, complementing
each other's strong points. You can use CHASM to provide new
functions you wish Turbo had, or to fine tune a critical
procedure for optimum speed. CHASM itself is written in a
combination of Turbo Pascal and CHASM.
CHASM supports two techniques for producing machine language
code for Turbo Pascal: external procedures or functions, and
Turbo INLINE code.
1. External Procedures and Functions:
Turbo loads external procedures and functions within the same
segment as the rest of your Pascal program. You have no
control of the exact load location (more on this later), but
on the other hand, you don't have to worry about setting aside
a special location for your procedures (as in BASIC). Since
your external procedure is loaded in the same segment as the
Pascal code, it should be declared NEAR to CHASM:
EXTERNAL PROC NEAR
... ;body of procedure
...
ENDP
Turbo passes parameters to your procedure/function via the
stack. To work effectively, a good grasp of the stack
structure is critical. Read the Turbo manual for information
on internal data formats and parameter passing, to see just
what to expect on the stack. Also, remember that the stack
grows down from the top of memory.
If you're going to access the stack in your procedure, the
first thing you should do is set up BP as a stack pointer.
Since Turbo also uses BP, you have to save the current value
first. The obvious place to save it is on the stack...
PUSH BP ;save old BP
MOV BP, SP ;and set up to indirectly address stack
69
You can now access the parameters on the stack using offsets
off the BP register. Note that since you PUSHed BP,
everything is 2 bytes deeper onto the stack than what Turbo
originally sent you.
Here's an example. Suppose you declare the following external
function in Turbo:
function Sum(x,y: integer): integer;
external 'sum.com';
After you've pushed BP, here's how the stack looks:
stack contents indirect address
----------------------------------------------
<value of parameter y> 6[BP]
<value of parameter x> 4[BP]
<return address to Turbo> 2[BP]
<old BP value> [BP]
The indirect addresses go up two at a time, since each item on
the stack is a word (two bytes) long. You can access the
parameters using their indirect addresses. Here's the code for
an external function to add two integer parameters:
SUM PROC NEAR
PUSH BP ;save old BP
MOV BP, SP ;set up stack pointer
MOV AX, 4[BP] ;get parameter x
ADD AX, 6[BP] ;add parameter y
;leave sum in AX to return to Turbo
POP BP ;restore old BP for Turbo
RET 4 ;clear params off stack
ENDP
A more elegant way to access the parameters is by using a
CHASM structure to define their offsets on the stack:
STACK STRUC
OLDBP DW 0000H
RETADDR DW 0000H
XPARAM DW 0000H
YPARAM DW 0000H
ENDSTRUC
70
With this structure added to the above example, you could
access the parameters like this:
MOV AX, XPARAM[BP] ;get parameter X
Functions return scalar results by having the value in AX upon
return. The function in the above example saves some time by
calculating the value in AX in the first place. Upon exit,
the function POPs BP, to restore the value Turbo was using.
Note the RET 4 in the example. This returns to Turbo, while
simultaneously POPing (and discarding) 4 bytes off the stack.
This clears off the two parameters which Turbo passed. If
there were three parameters, you'd use a RET 6; if none, a
simple RET would do. When Turbo receives control, it assumes
that you've cleaned up the stack by removing all parameters.
If you don't do this properly, a run-time error, or even a
system crash will result.
(Typical scenario: Turbo tries to return from one of its own
subroutines, thinking the top of the stack has the return
address. Unfortunately, the top is really an integer
parameter, left over from your external function. Turbo's
RET sends control into some random area of memory, and boom -
the system crashes.)
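If the bookkeeping seems slippery, a little simulation can help.
The Python sketch below models only the stack mechanics: word-sized
pushes and pops, PUSH BP / MOV BP,SP on entry, and RET n on exit.
The class and function names, the starting SP, and the sample
values are all hypothetical, and nothing here models Turbo itself.

```python
class Stack8088:
    """Word-sized stack that grows downward, as on the 8088."""

    def __init__(self, sp=0x1000):
        self.memory = {}
        self.sp = sp

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.memory[self.sp]
        self.sp = (self.sp + 2) & 0xFFFF
        return value

    def word_at(self, address):
        return self.memory[address & 0xFFFF]


def call_external(params, old_bp=0x1234, return_address=0x0456):
    """Push params, 'call' a routine that sums them, and return with RET n."""
    stack = Stack8088()
    sp_before = stack.sp
    for value in params:                 # caller pushes each parameter
        stack.push(value)
    stack.push(return_address)           # CALL pushes the return address
    stack.push(old_bp)                   # PUSH BP
    bp = stack.sp                        # MOV BP, SP
    # Parameters start at 4[BP]: above the saved BP and the return address.
    total = sum(stack.word_at(bp + 4 + 2 * i) for i in range(len(params))) & 0xFFFF
    restored_bp = stack.pop()            # POP BP
    returned_to = stack.pop()            # RET n: pop the return address...
    stack.sp = (stack.sp + 2 * len(params)) & 0xFFFF   # ...then discard n bytes
    return total, restored_bp, returned_to, stack.sp == sp_before
```

The last item in the result is the interesting one: it is true
only when the RET operand matches the number of parameter bytes
that were pushed.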
It's easy to get confused about the exact contents of the
stack. If your procedure doesn't seem to be working right,
the first thing to suspect is that you and Turbo have different
ideas about what's where on the stack. A DEBUG session can
usually straighten things out.
Boolean functions constitute a special case which is poorly
documented in the Turbo Pascal manual. Boolean functions must
return their result in two ways:
1. by setting the zero flag (Z = false, NZ = true)
2. and by returning either 0 (false) or 1 (true) in AX
The first return method is assumed by Turbo if you use the
function in a conditional statement, the second if you assign
the value of the function to a variable. You need to return
the result both ways to cover all possible uses of your
function.
71
When external functions are called, in addition to the normal
parameters, Turbo passes something called the "function
result" on the stack. When the result is a scalar type, this
seems to be intended just as a local work area for you to use,
since Turbo ignores any value you store there. Like a
parameter, the function result must be POPed off the stack
when you return to Turbo. Unlike scalar parameters (which
always occupy a word of stack memory, even if they are defined
as byte), the function result is the exact length of the
result type. Thus for a boolean function, you have to POP off
one extra byte above and beyond those for clearing off the
parameters.
A problem of addressability can crop up if your external
procedure tries to maintain its own local variables and/or
constants. The problem is that you have no way of knowing
just where Turbo is going to load your procedure within the
shared segment. As such, the addresses CHASM calculates for any
memory locations are going to be offset from their real values
by some unknown constant, the offset of the procedure within
the shared segment. This is called a relocation problem.
Fortunately, there's a way around this problem, but it
requires using a trick. Your program has to figure out, *at
run time*, just where it's located in memory. If you could
find out the offset of any known point in your procedure,
you'd "have your bearings" so to speak, and could go on.
The trick is as follows. The 8088 CALL instruction pushes the
address of the next instruction onto the stack, then branches
to the location given in the CALL. By performing a dummy
CALL, then stealing the value off the stack, we have the
location of a known spot in the procedure. By subtracting
the offset within the procedure of that known location, we get
a pointer to the beginning of the procedure which can be used
to access everything else. Here's an example:
72
LOCAL PROC NEAR
PUSH BP ;set up to access stack
MOV BP, SP ; ditto
CALL DUMMY ;establish addressability
DUMMY POP BX ; " "
SUB BX, OFFSET(DUMMY)
MOV AX, 4[BP] ;get parameter
SEG CS ;offset relative to CS
ADD OFFSET(TOTAL)[BX], AX ;maintain running total
POP BP
RET 2
TOTAL DW 0000H
ENDP
We use indirect addressing here. After the funny business
with the CALL, POP, SUB sequence, BX has a pointer to the
beginning of the procedure. Using indirect addressing, we
take that pointer and add in the offset of the memory location
we want to access. In this example we're using a local
variable to maintain a running total of the parameter which
gets passed.
Note the SEG CS just before the ADD which accesses the
location TOTAL. Since we found our bearings by stealing the
address of a program instruction, our offset is known relative
to the CS register, NOT the DS which is normally used to
access data. The SEG CS forces the 8088 to calculate the
address using CS rather than the default DS register. Every
time you access a memory location within your procedure, you
*MUST* do it relative to CS by using a segment override.
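The arithmetic of the trick can be followed in a few lines of
Python. This is a model of the address calculation only, with
16-bit wraparound. The offsets used in the usage note are made up;
they are not the real assembled offsets of DUMMY and TOTAL.

```python
def run_time_address(load_delta, dummy_offset, total_offset):
    """Follow CALL DUMMY / POP BX / SUB BX, OFFSET(DUMMY) / OFFSET(TOTAL)[BX].

    load_delta is the unknown difference between where the code really
    sits and where CHASM assumed it would be.
    """
    # CALL pushes the run-time address of the next instruction,
    # which is the POP BX labelled DUMMY.
    pushed = (load_delta + dummy_offset) & 0xFFFF
    # POP BX, then SUB BX, OFFSET(DUMMY): BX now holds the load delta.
    bx = (pushed - dummy_offset) & 0xFFFF
    # OFFSET(TOTAL)[BX]: the assembled offset plus the delta.
    return (bx + total_offset) & 0xFFFF
```

With a hypothetical delta of 2A00H and TOTAL assembled at 0115H,
the result is 2B15H, whatever offset DUMMY happens to have.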
Turbo requires that you preserve the values of the following
registers:
BP, CS, DS, SS
If you want to use these registers in your routine, the
easiest way to preserve them is to PUSH them onto the stack at
the beginning of your routine, then POP them just before
returning. As near as I can tell, you can safely trash the
other registers.
73
2. INLINE Code
An alternative method of incorporating code into Turbo Pascal
is through Turbo's inline statement. The inline statement is
intended for short routines or patches where you just give
Turbo a list of numbers representing the code. In addition to
numbers, you can also include variable names in the inline
list. Turbo replaces them with their offsets at compile time.
CHASM's INLINE pseudo-op is provided to facilitate producing
Turbo inline code. If you put an INLINE pseudo-op in your
CHASM source file, rather than producing a normal object code
file, CHASM produces a text file formatted to include in a
Turbo inline statement. For example:
inline ;shift to inline mode
mov ah, 0FH ;call bios for video mode
int 10H ; ditto
produces the following object file:
{ inline ;shift to inline mode }
$B4/$0F/ { mov ah, 0FH ;call bios for video mode}
$CD/$10/ { int 10H ; ditto }
Object code is output in text form, as hex constants. A
comment with the source code for each line is also generated.
The INLINE pseudo-op can appear anywhere in your source file,
and it requires no operands.
Object files produced from source files with an INLINE
pseudo-op are NOT executable! They contain text suitable for
inclusion in Turbo Pascal inline statements. It's probably a
good idea to override CHASM's default name for the object
file, and specify something with an extension other than COM
to prevent DOS from trying to run the program.
Your inline code must preserve the BP, SP, DS and SS
registers. If you need to modify these registers, you should
PUSH the ones you need at the beginning of your code, and POP
them at the end.
74
Turbo's inline statement allows you to insert variable names
in with the list of numbers. At compile time, Turbo will
replace the name with the 16 bit offset of the variable in its
native segment. (See the Turbo Pascal manual for a discussion
of internal data formats and the native segment of variables.)
CHASM supports this capability with the TURBO() function.
CHASM treats TURBO() as 16 bit immediate data during assembly.
However, in place of the two bytes of data, CHASM outputs the
function argument literally for Turbo to evaluate:
inline
mov bl, turbo(flag)[bp] ;local variable "flag"
mov ax, cs:[turbo(maxcode)] ;typed constant "maxcode"
lds si, turbo(y)[bp] ;pointer to var parameter "y"
produces:
{ inline }
$8A/$9E/FLAG/ { mov bl, turbo(flag)[bp] }
$2E/$A1/MAXCODE/ { mov ax, cs:[turbo(maxcode)] }
$C5/$B6/Y/ { lds si, turbo(y)[bp] }
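The bytes in this listing can be checked by hand against the standard 8088 instruction encoding. The Python sketch below is offered only as a reading aid and is not part of CHASM. It splits the second byte of each instruction (the ModR/M byte) into its fields; the table and function names are hypothetical.

```python
# Register names selected by the 3-bit reg field (standard 8088 encoding).
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
# Base/index combination selected by the r/m field when mod is 1 or 2.
RM = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def decode_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0x03, (byte >> 3) & 0x07, byte & 0x07


if __name__ == "__main__":
    # $8A is a byte-sized MOV into a register, so reg indexes REG8.
    mod, reg, rm = decode_modrm(0x9E)
    print(mod, REG8[reg], RM[rm])
    # $C5 is LDS, which always loads a word register.
    mod, reg, rm = decode_modrm(0xB6)
    print(mod, REG16[reg], RM[rm])
```

In both cases mod is 2, meaning a 16 bit displacement follows the ModR/M byte. That displacement is the slot where the variable name is written out for Turbo to fill in.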
>>Notes for Those Upgrading to This Version of CHASM<<
CHASM is not yet carved in stone - improvements and corrections
are made fairly frequently, based on both my own experience in
using the program, and the comments of outside users. This
section summarizes the changes which have been made since version
1.2 was released. Changes followed by an asterisk (*) denote
modifications which could invalidate programs written under
earlier versions of CHASM.
Version Notes
4.07 Speed enhancement. Intersegment JMP and CALL fixed.
Expressions now allowed in the subset.
4.06 Speed enhancement. Turbo Pascal INLINE facility.
/VIDEO configuration switch. New conditionals: IFREG,
IFREG8, IFREG16, IFSEGREG. New pseudo-ops: INLINE,
RADIX, CHKJMP, NOCHKJMP. New function: TURBO(). Single
quotes now allowed in strings (use two). Obsolete
COUNT, ENDC and DM pseudo-ops no longer supported. (*)
4.05 Speed enhancement. Characters now work in expressions.
EJECT and EJECTIF appear *before* page breaks. /DPAGE
configuration switch. Parameter collection on macro
invocations now halts before comments. SHL can now
also be written as SAL.
4.04 RET with displacement is now assembled correctly.
Parser bug fixed which crashed on strings with '/'.
4.03 Assembler variables. Operand size incompatibility now
reported. FSUB now assembles correctly. Intermittent
bug in expressions with forward references fixed.
Indirect memory references with offsets and segment
overrides now assemble correctly. Only the first
occurrence of a phase error is now reported.
4.02 DS storage assembly is now turned off during structure
definitions. Structures assemble faster. Intermittent
problem with macro parameter expansion cleared up.
4.01 8087 support. WAITON and WAITOFF pseudo-ops. Internal
labels for macros. Corrects bug which caused crash on
certain syntax errors.
4.00 Total rewrite in Turbo Pascal. Subset replaces
interpreted version as free distribution release.
Faster assembly. Symbol and macro tables now use all
available memory. Nested macros permitted. Operand
expressions. $ now returns location counter value. New
syntax for segment overrides. Conditional pseudo-ops
now evaluate operands. Improved error and diagnostic
messages. DOS 2 or later now required (*). DOS 2 path
support for files. Setting of DOS 2 ERRORLEVEL. EJECTIF
pseudo-op added. New configuration switches: /DWIDTH,
/FF, /TIMER, /PATH. CHASM.CFG must now have only one
switch per line (*).
3.15 Macros added. New pseudo-ops: MACRO, ENDM, IFE, IFNE,
IFB, IFNB, ELSE, ENDIF.
3.14 Interpreted version frozen at version 2.13, further
changes apply only to compiled version. Memory
requirement raised to 128K. INCLUDE, STRUC, ENDSTRUC,
DM, DW, LIST, NOLIST, COUNT and ENDC pseudo-ops added.
Alternate mnemonics for the jump on condition
instructions, and alternate syntax for the DIV and MUL
instructions. Binary numbers added.
2.13 Assembly can now be aborted with the Esc key. Negative
decimal numbers are working again. Input lines now
limited to 80 characters, and labels must begin with a
non-numeral. (*)
2.12 Listings can now be suppressed. Error messages echoed to
the console on non-screen listings. Expert mode added.
2.11 Pagination improved. Listings now time stamped. OFFSETs
and word values now allowed in DB operand list.
2.10 Equated symbols allowed in the DB operand list. Status
line improved.
2.09 The first digit of hexadecimal constants must now be in
the range 0-9. A leading zero is permitted on four digit
hex constants, to allow fulfilling this condition. (*)
2.08 Configuration process expanded. CHASM now skips over
perforations on printed listings. EJECT pseudo-op added.
2.07 Oops. Configuration file now works as advertised.
2.06 CHASM now supports reverse long jumps.
2.05 Compiled version released. BSAVE pseudo-op.
Configuration process simplified.
2.04 TABs are now expanded and replaced with blanks, for
compatibility with IBM text editors.
2.03 Two bugs corrected. The first bug involved incorrect
assembly of indirect memory references which used a
displacement in the range 128-255. The second caused a
program crash if a hex number longer than 4 digits was
found in the input file.
2.01 COM2DATA utility added.
2.00 Corrected a bug in the DS and DB pseudo-ops which caused
the last label in a program to be redefined as a memory
location. Also, the TAB character was added as a new
delimiter, and PRIMER.DOC was added to the CHASM package.
1.9 The short jump is now represented with mnemonic JMPS, for
compatibility with DEBUG version 1.1. (*)
1.8 The operand type "character" was added as a new way to
represent immediate data.
1.7 The DS operator now works for blocks larger than 255
bytes. Also, the OFFSET function now works properly in
the displacement field of an indirect memory reference.
1.6 A revision of this document. Some sections were improved
slightly, and in response to user requests, a section on
execution of assembled programs was added.
1.5 Corrected an error which generated the message "Data too
Long" if the value FFH was used as 8 bit immediate data.
1.4 User interface improved. CHASM now traps some common
input errors such as misspelling a file name, or
forgetting to turn on your printer.
1.3 A speed enhancement. Version 1.3 benchmarks about 5
times faster than version 1.2.
>>Miscellaneous and A Word From Our Sponsor...<<
A. Programming Notes:
1. CHASM is written in a combination of Turbo Pascal and
CHASM. This is less incestuous than it sounds: the
program was written in Turbo, then profiled to single
out the critical routines. The rate determining sections
were rewritten in optimized assembly language, and
assembled with the original Turbo Pascal version of CHASM.
These routines were then incorporated as Turbo external
procedures and functions in a new version of CHASM.
The speed enhancements possible by "helping out" Turbo with
CHASM can be quite dramatic. Replacing four rate limiting
routines out of the over two hundred routines in CHASM gave
almost a four-fold speed increase!
CHASM's source code is available to registered users by
sending a formatted disk and stamped return mailer. If you
make any improvements, I'd like to hear about them for
possible inclusion in future releases.
Please note that although you can modify CHASM for your own
use, under NO CIRCUMSTANCES may you distribute modified or
translated versions, either in the public domain or for
profit.
B. Red Tape and Legal Nonsense:
1. Disclaimer:
CHASM is distributed as is, with no guarantee that it will
work correctly in all situations. In no event will the
Author be liable for any damages, including lost profits,
lost savings or other incidental or consequential damages
arising out of the use of or inability to use these programs,
even if the Author has been advised of the possibility of
such damages, or for any claim by any other party.
Despite the somewhat imposing statement above, it *is* my
intention to fix any bugs which are brought to my attention.
See the appendix on Bug Reporting for more details.
2. Copyright Information:
The entire CHASM distribution package, consisting of the
main program, documentation files, and various data and
utility files, is copyright (c) 1983, 1984, 1985 and 1986
by David Whitman. The author reserves the exclusive right
to distribute this package, or any part thereof, for
profit. The name "CHASM (tm)", applied to a microcomputer
assembler program, is a trademark of David Whitman.
CHASM's Subset version and various subsidiary files may be
copied freely by individuals for evaluation purposes. It
is expected that those who find the package useful will
make a contribution directly to the author of the program.
The Subset version identifies itself by displaying a banner
page giving the author's address and inviting free copying.
ONLY VERSIONS DISPLAYING THIS BANNER PAGE MAY BE COPIED.
CHASM's Advanced version is only available to registered
users who have made the $40 suggested payment. Registered
users may copy the program for backup purposes, but must
restrict use of the program to either one user or one CPU,
at their option.
CHASM's source code is made available for educational
purposes and to allow users to customize for their own
personal use. Under NO CIRCUMSTANCES may modified versions
or translations into other computer languages be
distributed, either in the public domain or for profit.
User groups and clubs are authorized to distribute CHASM's
Subset version under the following conditions:
1. No charge is made for the software or documentation. A
nominal distribution fee may be charged, provided that
it is no more than $8 total.
2. Recipients are to be informed of the user-supported
software concept, and encouraged to support it with
their donations.
3. The program and documentation are not modified in ANY
way, and are distributed together.
Interested manufacturers are invited to contact Whitman
Software to discuss licensing CHASM for bundling with
MS-DOS based computer systems.
Distribution of CHASM outside the United States is through
licensed distributors, on a royalty basis. Interested
distributors are invited to contact Whitman Software.
3. Royalty Information:
No royalties are required to distribute programs produced
using CHASM. However, if you send me a copy of any major
program you have produced using CHASM, I'll give you a free
page of advertising in this document.
4. Educational Discount:
Substantial discounts are available for multi-CPU licenses
of CHASM's Advanced version to educational institutions.
Contact Whitman Software for details.
C. An Offer You Can't Refuse.
CHASM is User-Supported software, distributed under a
modification of the FREEWARE (tm) marketing scheme developed by
the late Andrew Fluegelman, whose inspiration and efforts are
gratefully acknowledged.
Anyone may obtain a free copy of CHASM's Subset version by
sending a blank, formatted diskette to the author. An
addressed, postage-paid return mailer must accompany the disk
(no exceptions, please).
A copy of the program, with documentation, will be sent by
return mail. The program will carry a notice suggesting a
payment to the program's author. Making the payment
is totally voluntary on the part of the user. Regardless of
whether a payment is made, the user is encouraged to
share the program with others. Payment for use is
discretionary on the part of each subsequent user.
The underlying philosophy here is based on three principles:
First, that the value and utility of software is best assessed
by the user on his/her own system. Only after using a
program can one really determine whether it serves personal
applications, needs, and tastes.
Second, that the creation of independent personal computer
software can and should be supported by those who benefit
from its use. Remember the Tanstaafl principle: There
Ain't No Such Thing as a Free Lunch.
Finally, that copying and networking of programs should be
encouraged, rather than restricted. The ease with which
software can be distributed outside traditional commercial
channels reflects the strength, rather than the weakness,
of electronic information.
If you like this software, please help support it. Your
support can take three forms:
1. Become a registered user. The suggested payment for
registration is $40.
2. Suggestions, comments, and bug reports. Your comments will
be taken seriously - user feedback was responsible for most
of the changes listed in CHASM's revision history.
3. Spread the word. Make copies of the Subset for friends.
Write the editor of your favorite computer magazine.
Astronomical advertising costs are one big reason that
commercial software is so overpriced. To continue offering
CHASM this way, I need your help in letting other people
know about CHASM.
Those who make the $40 payment to become registered users
receive the following benefits:
1. An upgrade to the Advanced version of the program. The
Advanced version executes twice as fast as the Subset and
supports macros, conditional assembly and other features.
An order form for the Advanced version is given at the end
of this manual.
2. User support, by phone or mail. Support is only
available to registered users. Phone numbers and the
address for help are given below.
3. Notices announcing the release of significant upgrades.
CHASM is copyrighted, and users are requested NOT to make copies
of the Advanced version other than for their own use. I am
strongly opposed to copy protection, and would regret being
forced to protect CHASM. Please recognize the amount of time and
money which went into producing CHASM, and respect the wishes of
the author.
David Whitman
P.O. Box 1157
North Wales, PA 19454
(215) 641-7522 (days)
(215) 234-4084 (evenings)
Appendix A: 8088 Mnemonic List
This appendix lists the mnemonics which CHASM will recognize,
grouped roughly by function. Consult "The 8086 Book" for
definitions of these instructions, and for the operands each will
accept. Mnemonics marked with an asterisk (*) will accept a 'B'
or 'W' suffix for ambiguous memory references.
Arithmetic:
AAA AAD AAM AAS ADC* ADD* CBW
CWD CMP* CMPS* DAA DAS DEC* DIV*
IDIV* IMUL* INC* MUL* NEG* SBB* SUB*
Data Movement:
LAHF LDS LEA LES LODS* MOV* MOVS*
POP POPF PUSH PUSHF SAHF XCHG XLAT
Logical:
AND* NOT* OR* TEST* XOR*
String Primitives:
CMPS* LODS* MOVS* SCAS* STOS*
Instruction Prefixes:
LOCK REP REPE REPNE REPNZ REPZ SEG
Program Counter Control: (unconditional)
CALL CALLN CALLF JMP JMPF JMPN JMPS
RET
Program Counter Control: (conditional)
JA JAE JB JBE JC JCXZ JE
JG JGE JL JLE JNA JNAE JNB
JNBE JNC JNE JNG JNGE JNL JNLE
JNO JNP JNS JNZ JO JP
JPE JPO JS JZ LOOP LOOPE LOOPNE
LOOPNZ LOOPZ
Processor Control:
CLC CLD CLI CMC HLT NOP STC
STD STI WAIT
I/O:
IN OUT
Interrupt:
INT INTO IRET
Rotate and Shift:
RCL* RCR* ROL* ROR* SAL* SAR* SHL*
SHR*
Appendix B: 8087 Mnemonic List
===> Advanced version only.
This appendix lists the 8087 mnemonics recognized by CHASM's
Advanced version, grouped roughly by function.
Arithmetic:
FADD FADDP FCHS FDIV FDIVP FDIVR FDIVRP
FIADD FIDIV FIDIVR FIMUL FISUB FISUBR FMUL
FMULP FPREM FSUB FSUBP FSUBR FSUBRP
Mathematical Functions:
F2XM1 FABS FPATAN FPTAN FRNDINT FSCALE FSQRT
FXTRACT FYL2X FYL2XP1
Data Movement:
FBLD FBSTP FILD FIST FISTP FLD FLD1
FLDL2E FLDL2T FLDLG2 FLDLN2 FLDPI FLDZ FST
FSTP FXCH
Comparison:
FCOM FCOMP FCOMPP FICOM FICOMP FTST FXAM
Processor Control:
FCLEX FDECSTP FDISI FENI FFREE FINCSTP FINIT
FNCLEX FNDISI FNENI FNINIT FNOP WAIT
Processor Status:
FLDCW FLDENV FNSAVE FNSTCW FNSTENV FNSTSW FRSTOR
FSAVE FSTCW FSTENV FSTSW
Appendix C: Differences Between CHASM and That Other Assembler
Virtually all magazine articles about assembly language
programming on the IBM PC assume that the reader is using That
Other Assembler - you know, the outrageously priced one. This
appendix will try to summarize the differences between the two
programs. Please note that I do not own a copy of That Other
Assembler, and therefore this section is not complete, nor even
guaranteed to be correct. I continue to work on this section,
and anyone with more experience is invited to make additions or
corrections, so this section will continually improve with time.
A. General Differences
The biggest difference is philosophical. The IBM assembler was
designed for use by professional assembly language programmers,
to write operating systems and other huge projects. This is
reflected in the large size and relative complexity of the macro
assembler.
On the other hand, CHASM was designed for use by beginners, to
write relatively short programs. This was done by leaving out
some of the power offered by IBM's assembler, in exchange for
simplicity and ease of use. The main simplification involved
producing object code in the COM format, rather than the EXE
format chosen by IBM. There are two main consequences of this
choice:
1. You can't link routines assembled by CHASM to
Microsoft languages. (Although you *can* include them in
Turbo Pascal or BASIC programs.)
2. Your program has to fit in one 64K segment. If (shudder!)
you want to write a 256K assembly language program, you're
out of luck.
Like Pascal, the IBM assembler is a strongly typed language. By
requiring you to specify the *type* of each memory location you
will access in your program, the IBM assembler generally knows
what size of memory operand you want. If you don't like the
declared size, you have to override the default with the PTR
operator. Thus, loading the AL register from a location
declared as a word is a syntax error, unless you specify BYTE PTR
before the address.
In analogy to the C language, CHASM is weakly typed. CHASM is
perfectly happy extracting a byte from where you originally set
aside a word - CHASM can't tell the difference. In most cases,
CHASM can tell what size you want from context: for example, if
you're using a word register, it *must* be a word memory access. On
the other hand, for any access to memory which doesn't have a
register as the other operand, you must add either a 'B' or a 'W'
to the instruction mnemonic used by IBM.
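The rule in the paragraph above can be modeled roughly as follows. This Python sketch is an illustration only and makes no claim about how CHASM is implemented; the names `operand_size`, `WORD_REGS` and `BYTE_REGS` are hypothetical.

```python
WORD_REGS = {"AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP",
             "CS", "DS", "ES", "SS"}
BYTE_REGS = {"AL", "AH", "BL", "BH", "CL", "CH", "DL", "DH"}


def operand_size(operands):
    """Return 'W' or 'B' if a register operand fixes the size, else None.

    A result of None means the size is ambiguous, so the mnemonic would
    need an explicit 'B' or 'W' suffix.
    """
    for op in operands:
        name = op.strip().upper()
        if name in WORD_REGS:
            return "W"
        if name in BYTE_REGS:
            return "B"
    return None


if __name__ == "__main__":
    print(operand_size(["ax", "[flag]"]))   # word register decides
    print(operand_size(["[flag]", "al"]))   # byte register decides
    print(operand_size(["[flag]", "0"]))    # no register: suffix required
```

In the last case neither operand is a register, which is the situation where a mnemonic marked with an asterisk in Appendix A takes its 'B' or 'W' suffix.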
B. Miscellaneous Differences:
1. Short Jumps:
IBM uses the SHORT keyword, CHASM uses an 'S' suffix.
Example:
JMP SHORT label ;ibm
JMPS label ;chasm
2. Offset Function:
Where IBM precedes an operand with the keyword OFFSET,
CHASM has a *function* called OFFSET. CHASM requires
parentheses around the operand. Example:
MOV AX, OFFSET FCB ;ibm
MOV AX, OFFSET(FCB) ;chasm
3. Declaring Storage:
A. If you don't care what value a memory location is
initialized to, the IBM assembler allows you to specify
'?' as its contents. In CHASM, if you don't care what
value the variable is initialized to, just put down a
zero. Example:
DB ? ;ibm
DB 0 ;chasm
B. The IBM assembler allows the keyword DUP as an operand
in storage declaring pseudo-ops. This means to repeat
the definition as many times as the number just before
the DUP. Example:
DW 3 DUP(?) ;ibm
DW 0, 0, 0 ;chasm
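When converting a published listing by hand, the two storage rules above amount to a mechanical rewrite of the operand. The Python sketch below shows that rewrite under simplifying assumptions: the repeat count is a plain decimal number, DUPs are not nested, and the helper name `expand_dup` is hypothetical.

```python
import re

# Matches operands of the form "<count> DUP(<value>)", ignoring case.
_DUP = re.compile(r"^\s*(\d+)\s+DUP\s*\(\s*(.*?)\s*\)\s*$", re.IGNORECASE)


def expand_dup(operand):
    """Rewrite an IBM-style storage operand as a CHASM operand list."""
    match = _DUP.match(operand)
    if match is None:
        # Not a DUP expression: only the '?' placeholder needs changing.
        text = operand.strip()
        return "0" if text == "?" else text
    count = int(match.group(1))
    value = match.group(2)
    if value == "?":
        value = "0"          # "don't care" becomes an explicit zero
    return ", ".join([value] * count)


if __name__ == "__main__":
    print("DW " + expand_dup("3 DUP(?)"))
    print("DB " + expand_dup("?"))
```

For long blocks of uninitialized storage a written-out list of zeros becomes unwieldy, so this rewrite is best suited to the short tables found in typical magazine listings.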
4. ASSUME Pseudo-op:
IBM's ASSUME pseudo-op tells the assembler where the segment
registers will be pointing. CHASM always assumes that the CS,
DS and ES registers point to the beginning of the code
segment, and that the SS register has been set up to point to
a valid stack area. If you find an ASSUME pseudo-op with
different assumptions for the CS, DS and ES registers you'll
have to figure out the addresses for memory references in the DS
and ES segments yourself.
5. Segment Pseudo-op:
This pseudo-op is used to set up multiple segments in the IBM
assembler. Since CHASM only allows one segment, there is no
equivalent pseudo-op. If there is only one segment definition
in an IBM assembler program, everything is fine, just leave
the pseudo-op out for CHASM.
Often the SEGMENT pseudo-op is used to provide
addressing of an area in the BIOS, or perhaps the interrupt
vector table at the beginning of memory. For example, if a
program needed to get at the BIOS data area, in the IBM
assembler you would define a dummy segment with the same
structure as that in the BIOS listing in Technical Reference:
DATA SEGMENT AT 40H
RS232_BASE DW 4 DUP(?)
PRINTER_BASE DW 4 DUP(?)
EQUIP_FLAG DW ?
MFG_TST DB ?
MEMORY_SIZE DW ?
IO_RAM_SIZE DW ?
DATA ENDS
All this is really accomplishing is giving a name to some
memory locations which are outside the actual program being
written.
In CHASM, you can simulate the dummy segment using a
structure. This will generate a series of immediate operands
whose values correspond to the offsets of the labels in the
dummy segment. You can then reference the locations by
enclosing the label names in square brackets, to coerce from
type immediate to type address.
DUMMY STRUC ;simulate dummy segment
RS232_BASE DW 0, 0, 0, 0
PRINTER_BASE DW 0, 0, 0, 0
EQUIP_FLAG DW 0
MFG_TST DB 0
MEMORY_SIZE DW 0
IO_RAM_SIZE DW 0
ENDSTRUC
MOV AX, [EQUIP_FLAG]
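To see which immediate values the structure labels take on, the offsets can be tallied by hand or with a short script. The Python sketch below is an illustration only: it assumes each label receives the running byte count of the fields before it, as the text describes, and the names `field_offsets` and `BIOS_DATA` are hypothetical.

```python
# Bytes reserved per element by each storage pseudo-op.
SIZES = {"DB": 1, "DW": 2}

# (label, pseudo-op, number of elements), in declaration order.
BIOS_DATA = [
    ("RS232_BASE", "DW", 4),
    ("PRINTER_BASE", "DW", 4),
    ("EQUIP_FLAG", "DW", 1),
    ("MFG_TST", "DB", 1),
    ("MEMORY_SIZE", "DW", 1),
    ("IO_RAM_SIZE", "DW", 1),
]


def field_offsets(fields):
    """Map each label to its byte offset from the start of the structure."""
    offsets = {}
    position = 0
    for name, directive, count in fields:
        offsets[name] = position
        position += SIZES[directive] * count
    return offsets


if __name__ == "__main__":
    for name, offset in field_offsets(BIOS_DATA).items():
        print("{:<13}{:02X}H".format(name, offset))
```

EQUIP_FLAG comes out as 10H, so the MOV above addresses offset 10H. These offsets are relative to the dummy segment at 40H, so the segment register used for the access has to hold that value.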
6. Labels:
The macro assembler indicates a local label by appending a
colon (:). The colon does not become part of the label, and
is not included when referencing the label. CHASM's labels
are all global, and although they may end with a colon, the
colon will become part of the label itself, and must then be
used when referencing the label. Example:
a2: mov ax,cx ;ibm
jmp a2 ; "
a2: mov ax,cx ;chasm
jmp a2: ; "
CHASM does provide support for local labels in macros. See the
discussion on internal labels in the macro section of this
document.
7. Entry Point:
The macro assembler allows you to specify the point within
your program where execution will begin. A label is put
on the entry point, then to indicate entry, the same label is
placed on the "END" pseudo-op. Since COM programs must always
start at offset 100H, CHASM doesn't allow setting an entry
point, nor does it use the END pseudo-op.
Appendix D: Description of Files
Your CHASM distribution disk contains a number of files. This
appendix will give a brief statement of the purpose of each.
FILE DESCRIPTION
----------------------------------------------------------------
CHASM.CFG Sample configuration file, for IBM printer.
CHASM.COM The CHASM program (either subset or advanced)
CHASMS.COM This file may appear on the disk of advanced
version users. It is the current subset version,
that can be shared with others. Please rename it
CHASM.COM on their disk, to avoid confusion.
CHASM.DOC This document.
EXAMPLE.ASM Sample source file.
COM2DATA.ASM Source code for COM2DATA filter.
COM2DATA.DOC Documentation for COM2DATA.
FREEWARE.DOC References to other User Supported programs.
PRIMER.DOC Simple introduction to assembly language.
Occasionally, various other sample source files for CHASM will be
distributed. These files will have extension ASM, and will be
accompanied by a corresponding DOC file.
Appendix E: Bug Reporting Procedure
Although each version of CHASM is tested extensively prior to
release, any program of this magnitude is bound to contain a few
bugs. It is the intention of Whitman Software to correct any
genuine problem which is reported.
If you think you have found a bug in CHASM, please take the time
to report it for correction. Although any report is helpful,
correction of the problem will be easiest if you provide the
following:
1. The version of CHASM you are using. Your problem may have
been fixed already!
2. A brief description of what you believe the problem to be.
3. A printed listing of a source file which manifests the
problem.
* DON'T send a 5,000 line program which has one
manifestation of the bug! Isolate the problem area, or
write a short sample routine that demonstrates the bug.
Unlike normal commercial software, where corrections are saved up
for a major revision, bugs in CHASM are fixed as soon as reported,
with a new version released almost immediately (which is why there
are so many versions in CHASM's revision history).
=====> BONUS <======
If you send a copy of your problem source file on disk, it will
be returned with either a new, corrected version of CHASM, or
with an explanation of what you were doing wrong to *think* you'd
found a bug.
Appendix F: Using CHASM on "Compatible" Systems
CHASM was written specifically for the IBM PC, but should
function normally on true "compatibles". This appendix is a new
section to summarize compatibility data for various systems.
Since CHASM version 4 is a totally new program, little
compatibility data is currently available. If you are using (or
are unable to use...) CHASM on a non-IBM computer, please write
with your experiences. Does CHASM work correctly on your system?
Are there specific problem areas? Can they be worked around?
If you are using a non-IBM system, I strongly recommend that at
least to start out, you include the line:
/VIDEO 2
in your CHASM.CFG file (see the section on "Modifying CHASM's I/O
Defaults" for a discussion of the CHASM.CFG file). This will
force CHASM to use BIOS calls to access your video screen. By
default, CHASM writes directly to the screen hardware for maximum
speed. Without this line, if your hardware is not *strictly* IBM
compatible, CHASM's output could be invisible, or your system
could even hang up and require re-booting.
The following systems are reported to run CHASM version 4
successfully:
AT&T 6300
Chameleon
Columbia 1600-1
Compaq, Compaq DeskPro, Compaq Plus
Corona PC-2
Heath 151
IBM PC, XT, AT, 3270 PC, 3270 PC/G
ITT XTRA
Kaypro 16
Leading Edge
Mega XT
PC Designs FD-1
Sanyo 555
Superior PC
Tandy 1000, 2000
Televideo 1605
Zenith 150
The following systems have one or more problems:
================================================================
Tandy 1200HD - CHASM crashes on some systems. The problem seems
to be related to the exact amount of free memory available on the
system. You can prevent the crash by slightly changing the
amount of free memory, to either a higher or lower number.
Probably the easiest way to do this is to change the number of
disk buffers, or load an extra device driver such as ANSI.SYS.
Call Whitman Software if you have trouble.
================================================================
================================================================
IBM PCjr. - Several users report getting as far as the source file
prompt, at which point the program crashes. Several other users
(invariably with more than 128K of memory) report that CHASM works
fine. The PCjr uses part of user memory for the screen buffer, and
PCjr users probably need 192K to run CHASM.
================================================================
********ADVANCED VERSION ORDER FORM********
Please add me to the list of registered CHASM users, and send me
an upgrade to Advanced CHASM. I understand that CHASM is
copyrighted, and agree not to distribute any unauthorized copies
of this Advanced version.
Note that version 4 requires DOS 2 (or later)
and 128K of memory. (192K for PCjr)
Computer Model: ____________________________________
Diskette format: Total Memory: _______K
__ single sided/9 sector
__ double sided/9 sector
Check one:
___ I enclose a check for $40
___ I am a past customer. The enclosed check brings my
total payment up to $40.
Where did you hear about CHASM? ________________________________
Name: _______________________________________________________
Address: _______________________________________________________
City, State, Zip: ______________________________________________
================================================================
Send order form and check to:
Whitman Software
P.O. Box 1157
North Wales, PA 19454
==============PRINTER ENHANCEMENT===================
Michael Hoyt, of Soft and Friendly Software, has produced a set
of printer enhancement programs using CHASM, which is sold under
the name Prowriter Utilities. The package supports the following
printers:
NEC 8023A-C
Prowriter I (C. Itoh 8510)
Prowriter II (C. Itoh 1550)
The package contains three programs:
PRINT_CHARACTERS
PRINT_SCREEN
PRINT_SET
Once PRINT_CHARACTERS is run, it attaches itself to DOS, and
makes your printer have exactly the same character set as your
video monitor. The conversion is very professionally done.
Particularly impressive are the line drawing characters, which
actually form connected lines, both horizontally and vertically.
As if this wasn't enough, PRINT_CHARACTERS adds italics
capability as well. The italics make very effective emphasis in
documents and letters, and look really good.
PRINT_SCREEN is a graphics screen dump, activated by the normal
Shift/PrtSc sequence. Several options are available which trade
off speed and print quality. Since I have the mono card, I
haven't tried PRINT_SCREEN, but Michael sent me a sample printout
which looked quite nice.
PRINT_SET is a menu-driven program to turn on and off the various
special printing modes supported by these printers. A simple but
effective program.
I've been using this package with my NEC 8023 for a few months
now, and I like it quite a bit. To get a copy, send $35 to:
Soft and Friendly
RR 2 Box 65
Solsberry, IN 47459
================= NEW PRODUCT ==================
If you use the IBM/Microsoft BASIC compiler, chances are your
programs are bigger and slower than they have to be. If all
unreferenced line numbers are removed from your source program,
and the /N switch is used, BASCOM will "optimize" your program.
The result is tighter, more efficient code.
NUMZAP is a utility which carefully scans your source file, and
deletes all the non-essential line numbers. Performing this task
by hand would be prohibitively time consuming and you'd probably
introduce errors into your program in the process. NUMZAP will
do the job in minutes, 100% error free.
The old BASIC version of CHASM was passed through NUMZAP, and the
resulting compiled code shrank by 10% (!). That 10%
reduction could make the difference between your program running
in 64K, or having users with minimal systems get "Out of Memory"
messages just before your program crashes.
An added advantage to using NUMZAP is that bigger programs can be
compiled. You may not be aware that there is a limit on the size
of program which the compiler can handle. BASCOM uses up space
remembering the offset of each line number in your program. If
you have too many numbered lines, BASCOM will run out of room and
you'll get an unending series of "TC" (Too Complex) error
messages. By eliminating the unneeded line numbers, you give
BASCOM more elbow room. The free space available to compile
CHASM increased 27% (!) after using NUMZAP.
NUMZAP is available under the standard FREEWARE deal - just send
a formatted disk and self-addressed, stamped return mailer to:
David Whitman
P.O. Box 1157
North Wales, PA 19454
Be sure to specify that you are interested in NUMZAP. If you
like the program, a donation of $15 is suggested.